Initial commit

Dominik Moritz Roth 2024-06-17 20:55:11 +02:00
commit c31fe1e75a
13 changed files with 642 additions and 0 deletions

7
.gitignore vendored Normal file

@ -0,0 +1,7 @@
cache_dir
dist
*.egg-info
*.assets
.venv
*/__pycache__

2
MANIFEST.in Normal file

@ -0,0 +1,2 @@
include magic/hocuspocus.c

80
README.md Normal file

@ -0,0 +1,80 @@
<h1 align="center">
<br>
<img src='./icon.svg' width="250px">
</h1>
# MAGIC (**M**agic **A**ccelerates via **G**eneral **I**ntercept-based **C**aching)
This Python module **MAGIC**ally mitigates performance issues on BWUni caused by faulty OS cache configuration.
The term "**HOCUSPOCUS**" (**H**ocuspocus **O**vercomes **C**onfiguration **U**psies for **S**uperior **P**erformance: **O**ptimized **C**aching via **I**ntercepting **U**serspace **S**yscalls) finds its origins in a misinterpretation from the 17th century, rooted in the Latin phrase "Hoc est corpus" used during the Catholic Mass to signify the transformation of bread into the Body of Christ. To those unfamiliar with Latin, this sacred invocation sounded like mystical jargon, which they mockingly or mistakenly transformed into "hocus pocus."
## Function
`hocuspocus(inform_cache_hit=False, inform_cache_miss=False, inform_cache_write=True, filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist="", verbose=False)`
Configures and activates the caching mechanism, which uses RAM and the local SSD. By default it caches files with the extensions `.py`, `.pyc`, `.so` and `.dll` and reports cache writes. Cached copies are written to a `cache_dir` directory under the current working directory. Call `hocuspocus()` at the very beginning of your script, before importing other packages: the function re-runs the script in a child process with the interceptor loaded, so any code placed before `magic.hocuspocus()` runs twice and must not have side effects. Use the parameters below to customize the caching behavior.
### Parameters:
- `inform_cache_hit`: Boolean flag to print cache hits (default: False).
- `inform_cache_miss`: Boolean flag to print cache misses (default: False).
- `inform_cache_write`: Boolean flag to print cache writes (default: True).
- `filetype_whitelist`: Comma-separated string of file extensions to cache (default: ".py,.pyc,.so,.dll").
- `file_blacklist`: Comma-separated string of path substrings; any file whose path contains one of them is excluded from caching (default: "").
- `verbose`: Boolean flag to print debug output from the interceptor (default: False).
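As a rough illustration of how the two filter parameters interact, the sketch below mirrors the matching rules of `should_cache` in `hocuspocus.c` in plain Python. The name `would_cache` is hypothetical and not part of the package. It assumes what the C source suggests: blacklist entries match as substrings anywhere in the path, and whitelist entries are compared against everything from the last `.` onward.

```python
def would_cache(path, filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist=""):
    """Hypothetical mirror of should_cache() from hocuspocus.c, for illustration only."""
    # Empty fields are dropped, so "" means "no entries".
    blacklist = [entry for entry in file_blacklist.split(",") if entry]
    whitelist = [entry for entry in filetype_whitelist.split(",") if entry]

    # A blacklist entry matches as a substring anywhere in the path.
    if any(entry in path for entry in blacklist):
        return False
    # With no whitelist at all, every file is eligible.
    if not whitelist:
        return True
    # The extension is everything from the last '.' in the whole path.
    dot = path.rfind(".")
    return dot != -1 and path[dot:] in whitelist


print(would_cache("/opt/env/lib/numpy/__init__.py"))
print(would_cache("/data/run1/weights.bin"))
print(would_cache("/opt/env/lib/numpy/__init__.py", file_blacklist="numpy"))
```

Because the blacklist is a substring match, a short entry such as `numpy` excludes every path that contains it, not just a single file.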
## Example Usage
```python
import magic
magic.hocuspocus()
import numpy as np
import torch as th
# Your code here
```
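The note about code running twice follows from the re-execution pattern in `magic.py`: the first run starts a child process with an environment flag set and exits with its return code. The sketch below reproduces only that pattern in a throwaway script; `DEMO_ACTIVE` and `demo.py` are hypothetical stand-ins (the package itself uses `MAGIC_ACTIVE` and also sets `LD_PRELOAD`).

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# A tiny script that re-runs itself once, guarded by an environment flag.
SCRIPT = textwrap.dedent('''
    import os, subprocess, sys
    print("before guard")                    # runs in the parent AND the child
    if os.getenv("DEMO_ACTIVE") != "1":      # parent: re-run self, then exit
        env = dict(os.environ, DEMO_ACTIVE="1")
        sys.stdout.flush()
        sys.exit(subprocess.run([sys.executable, sys.argv[0]], env=env).returncode)
    print("after guard")                     # runs in the child only
''')

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.py")
    with open(path, "w") as handle:
        handle.write(SCRIPT)
    # Make sure the flag is not inherited from the surrounding environment.
    env = {k: v for k, v in os.environ.items() if k != "DEMO_ACTIVE"}
    result = subprocess.run([sys.executable, path], env=env,
                            capture_output=True, text=True)

lines = result.stdout.splitlines()
print(lines)
```

Everything above the guard appears twice in the output, which is why only side-effect-free code should precede the `magic.hocuspocus()` call.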
## How It Works
It's actually not magic... Python relies on the OS cache to keep actively used modules in RAM for quick access. However, an issue at BWUni causes these modules to be evicted repeatedly, leading to frequent and unnecessary reloads that need to pass through the internal network backbone. ATIS has confirmed this as the underlying issue but has yet to find a solution.
We make use of the `LD_PRELOAD` trick to inject a shared library (`hocuspocus.so`) into the Python process. This library overrides the standard file-opening functions (`open`, `open64`, `openat`, `fopen` and `fopen64`) with our custom implementations. These intercept every attempt to open a file and, for paths that pass our white- and blacklists, return a file descriptor that points to a copy in a local cache, which we populate automatically, instead of the original file. (OS-level VFS caching is also functional on these cached copies, so we get a 2 level RAM/SSD cache overall.) This explicit caching prevents the erroneous evictions caused by the misconfiguration; once a module is loaded from the cache, it remains quickly accessible, reducing the overhead of repeated file loading and significantly improving performance.
This approach results in a significant decrease in training time and an even more significant decrease in the number of automatic e-mails sent by ATIS regarding 'high I/O activity'.
This package is only meant for Python applications, but the provided `hocuspocus.so` could also work on a wide range of other applications. Have a look at our fairly minimal source code if you wanna try to adapt it...
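The rerouting step described above can be sketched in Python. This is an illustration under stated assumptions, not the interceptor itself: the real logic lives in C inside `hocuspocus.so`, and the helper names `cached_path_for` and `reroute` are hypothetical. The only detail taken from the C source is the naming scheme, where every `/` in the original path becomes `_` so that each file maps to one flat name inside the cache directory. POSIX-style paths are assumed.

```python
import os
import shutil
import tempfile


def cached_path_for(path, cache_dir):
    """Flatten a path into one file name inside cache_dir ('/' becomes '_')."""
    return os.path.join(cache_dir, path.replace("/", "_"))


def reroute(path, cache_dir):
    """Return the cached copy of path, creating it on the first access."""
    target = cached_path_for(path, cache_dir)
    if not os.path.exists(target):  # cache miss: populate the cache
        shutil.copyfile(path, target)
    return target  # cache hit, or the freshly written copy


with tempfile.TemporaryDirectory() as tmp:
    cache_dir = os.path.join(tmp, "cache_dir")
    os.makedirs(cache_dir)
    source = os.path.join(tmp, "module.py")
    with open(source, "w") as handle:
        handle.write("x = 1\n")

    first = reroute(source, cache_dir)   # miss: copies the file
    second = reroute(source, cache_dir)  # hit: reuses the copy
    print(first == second)
```

A flat naming scheme keeps the lookup to a single existence check, at the cost that two different paths could in principle collide (for example `a/b_c` and `a_b/c`).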
## Benchmarks
### Training wall-clock-time reduction for RL workloads
TODO
### Automatic ATIS e-mail reduction
#### Before:
![mail_before](benchmarks/mail_before.png)
#### After:
![mail_after](benchmarks/mail_after.png)
We achieve a 100% reduction in automatic mails received from ATIS.
## Authors
ChatGPT-4o (Lead Developer)
Dominik Roth (Manager, Assistant Developer and Benchmarking)
Questions should primarily be directed at the lead developer (ChatGPT-4o).
## Donations
DogeCoin: DGUjmkYd3pzV2ovUydRs6c1dmd6AHV4Aby
Up to 50% of the total funds received through donations will be forwarded to Sam Altman's 7 trillion USD funding round.
*Note: ATIS seems to be highly competent most of the time. This repo is not meant as an attack; it is merely the result of coding while in a silly goofy mood.*

BIN
benchmarks/mail_after.png Normal file

Binary file not shown.


BIN
benchmarks/mail_before.png Normal file

Binary file not shown.


25
fib.py Normal file

@ -0,0 +1,25 @@
import magic
magic.hocuspocus()
import numpy as np
import pickle
import json
import sys
def fib(n):
    # Define the transformation matrix
    F = np.array([[1, 1],
                  [1, 0]], dtype=object)
    # Use matrix exponentiation
    return np.linalg.matrix_power(F, n-1)[0, 0]
import numpy as np2
# Example usage
n = 100 # Change n to compute a different Fibonacci number
print(f"The {n}th Fibonacci number is {fib(n)}")
print(np.random.mtrand.beta(1,2))
print(sys.meta_path[0].module_cache.keys())

81
icon.svg Normal file

@ -0,0 +1,81 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->
<svg
width="209.35104mm"
height="207.26178mm"
viewBox="0 0 209.35104 207.26178"
version="1.1"
id="svg5"
xml:space="preserve"
inkscape:version="1.3.2 (091e20ef0f, 2023-11-25)"
sodipodi:docname="fancy_magic.svg"
xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
xmlns="http://www.w3.org/2000/svg"
xmlns:svg="http://www.w3.org/2000/svg"><sodipodi:namedview
id="namedview7"
pagecolor="#ffffff"
bordercolor="#999999"
borderopacity="1"
inkscape:showpageshadow="0"
inkscape:pageopacity="0"
inkscape:pagecheckerboard="0"
inkscape:deskcolor="#d1d1d1"
inkscape:document-units="mm"
showgrid="false"
inkscape:zoom="0.70026212"
inkscape:cx="427.69699"
inkscape:cy="422.69886"
inkscape:window-width="1720"
inkscape:window-height="1403"
inkscape:window-x="26"
inkscape:window-y="23"
inkscape:window-maximized="0"
inkscape:current-layer="g1"
showguides="false" /><defs
id="defs2"><linearGradient
id="linearGradient3331"
inkscape:swatch="solid"><stop
style="stop-color:#000000;stop-opacity:1;"
offset="0"
id="stop3329" /></linearGradient></defs><g
inkscape:label="Layer 1"
inkscape:groupmode="layer"
id="layer1"
transform="translate(-0.33538006,-89.544919)"><path
id="path243"
style="fill:#1b9783;fill-opacity:1;stroke-width:1.82698"
d="M 105.01116,89.544922 C 47.200277,89.544813 0.33526959,135.94201 0.33538005,193.17581 0.33527406,250.40961 47.20028,296.8068 105.01116,296.80669 162.82184,296.80652 209.68654,250.40941 209.68643,193.17581 209.68654,135.9422 162.82184,89.545091 105.01116,89.544922 Z"
sodipodi:nodetypes="ccccc" /><g
id="g8676"
transform="matrix(0.35277778,0,0,0.35277778,93.730168,-46.079548)"><path
d="m 210.96344,516.0861 12.37693,-4.7124 -12.37693,-4.71237 v -0.002 c -2.44481,-0.93931 -4.69468,-3.36861 -6.45332,-6.97435 -1.75974,-3.6034 -2.94651,-8.21411 -3.4035,-13.22394 l -2.29965,-25.36242 -2.29965,25.36242 h -9.7e-4 c -0.45329,5.01503 -1.63749,9.62822 -3.3973,13.23664 -1.76102,3.60599 -4.01242,6.03262 -6.45951,6.96165 l -12.37693,4.71238 12.39453,4.71498 c 2.44862,0.92368 4.70251,3.35031 6.46212,6.95631 1.76101,3.60862 2.94394,8.22414 3.3947,13.24199 l 2.29965,25.32299 2.29966,-25.32299 h 9.6e-4 c 0.44949,-5.01245 1.63115,-9.62555 3.38851,-13.23197 1.75593,-3.60861 4.0059,-6.03791 6.45071,-6.967 z"
id="path8600"
style="fill:#ffffff;fill-opacity:1;stroke-width:0.46663" /><path
d="m 241.05123,518.51515 c 0.30634,-3.04032 1.58367,-5.41746 3.21463,-5.98793 l 4.02723,-1.3888 -4.02723,-1.3888 c -1.63567,-0.55553 -2.91649,-2.94127 -3.21463,-5.98793 l -0.74559,-7.50158 -0.74558,7.50158 c -0.30634,3.03817 -1.58251,5.41746 -3.21463,5.98793 l -4.02723,1.3888 4.02723,1.3888 c 1.62528,0.59427 2.89569,2.96067 3.21463,5.98793 l 0.74558,7.50158 z"
id="path8602-6-7"
style="fill:#ffffff;fill-opacity:1;stroke-width:0.403881" /><path
d="m 177.89263,495.09636 c 0.25962,-2.6322 1.34213,-4.69025 2.72434,-5.18414 l 3.413,-1.20237 -3.413,-1.20238 c -1.3862,-0.48096 -2.47167,-2.54645 -2.72434,-5.18414 l -0.63187,-6.4946 -0.63187,6.4946 c -0.25961,2.63034 -1.34114,4.69025 -2.72433,5.18414 l -3.413,1.20238 3.413,1.20237 c 1.37739,0.5145 2.45404,2.56324 2.72433,5.18414 l 0.63187,6.4946 z"
id="path8602-6-7-6"
style="fill:#ffffff;fill-opacity:1;stroke-width:0.345954" /><path
d="m 231.84066,571.8627 c 0.32831,-3.0209 1.18695,-5.79883 2.46229,-7.9706 1.27354,-2.17492 2.90247,-3.63925 4.67382,-4.1991 l 8.97453,-2.60256 -8.97453,-2.86707 c -1.76959,-0.55373 -3.39852,-2.00884 -4.67198,-4.17454 -1.27353,-2.1657 -2.134,-4.93741 -2.46408,-7.95247 l -1.73354,-15.52615 -1.65595,15.30485 c -0.33011,3.01474 -1.19055,5.78653 -2.46408,7.95247 -1.27355,2.16594 -2.90249,3.62083 -4.67199,4.17454 l -8.86969,3.08861 8.94728,2.82401 v -0.003 c 1.77139,0.56295 3.40033,2.02725 4.67383,4.19911 1.27533,2.17491 2.13399,4.95284 2.46409,7.97374 l 1.57661,15.04023 z"
id="path8604"
style="fill:#ffffff;fill-opacity:1;stroke-width:0.603057" /><path
d="m 162.84035,471.82125 c 0.48729,-2.70075 1.76171,-5.1843 3.65462,-7.12591 1.89023,-1.94444 4.30794,-3.25359 6.93705,-3.75411 l 13.32028,-2.32675 -13.32028,-2.56324 c -2.6265,-0.49504 -5.04422,-1.79595 -6.93432,-3.73215 -1.89022,-1.93619 -3.16735,-4.41416 -3.65728,-7.10971 l -2.57297,-13.88077 -2.45781,13.68293 c -0.48996,2.69525 -1.76706,5.1733 -3.65729,7.10971 -1.89023,1.93641 -4.30796,3.23711 -6.93431,3.73214 l -13.1647,2.7613 13.27985,2.52474 v -0.003 c 2.62916,0.5033 5.04689,1.81242 6.93705,3.75412 1.89291,1.94442 3.16735,4.42797 3.65729,7.12872 l 2.34007,13.44636 z"
id="path8604-3"
style="fill:#ffffff;fill-opacity:1;stroke-width:0.694679" /></g><g
style="fill:#ffffff"
id="g3"
transform="matrix(0.21747691,0,0,0.21747691,49.445442,135.78153)"><g
id="g2"
style="fill:#ffffff">
<g
id="g1"
style="fill:#ffffff">
<path
d="m 528.2755,197.2411 -53.42959,-73.58885 30.33106,-85.732703 c 2.7299,-7.717696 0.84135,-16.31623 -4.86989,-22.180115 -5.71233,-5.8638986 -14.25849,-7.9758475 -22.04501,-5.448947 l -86.49746,28.075966 -72.16471,-55.33693 c -6.49446,-4.981259 -15.25719,-5.843848 -22.59915,-2.223436 -7.34195,3.62041 -11.9919,11.0965105 -11.99409,19.28174992 l -0.0283,90.93911508 -74.9309,51.53103 c -6.74597,4.63905 -10.27049,12.70457 -9.09624,20.80674 1.17534,8.10109 6.84706,14.83461 14.6323,17.36517 l 63.37456,20.61275 -298.091966,290.38781 c -8.508081,8.28816 -8.686362,21.90414 -0.398201,30.41221 4.14353,4.25457 9.619553,6.42697 15.1219931,6.49902 5.5024392,0.0721 11.0345304,-1.95514 15.2880489,-6.10084 L 308.96985,232.1541 l 18.9464,63.89252 c 2.32586,7.84776 8.90858,13.69382 16.9761,15.08196 1.12309,0.19195 2.24729,0.29458 3.36399,0.3092 6.90917,0.0905 13.52982,-3.15951 17.67379,-8.85755 l 53.47522,-73.55602 90.9087,2.35269 c 8.20656,0.19182 15.77778,-4.24159 19.59024,-11.48513 3.81247,-7.24356 3.18177,-16.02476 -1.62879,-22.65067 z M 409.21759,185.73284 c -7.12489,-0.22859 -13.79018,3.12579 -17.95184,8.85175 l -34.49879,47.45316 -16.67954,-56.24793 c -2.01186,-6.78516 -7.23507,-12.14806 -13.96631,-14.3369 l -55.79192,-18.14652 48.34015,-33.24453 c 5.83297,-4.01138 9.31804,-10.63475 9.31874,-17.71392 l 0.0182,-58.668976 46.55645,35.700284 c 5.61584,4.306319 12.99228,5.573932 19.72776,3.390418 L 450.09282,64.657862 430.52618,119.9669 c -2.36017,6.67381 -1.28721,14.0809 2.87272,19.80975 l 34.47018,47.47415 z"
id="path1"
style="fill:#ffffff;stroke-width:1.08219" />
</g>
</g></g></g></svg>


16
install_package.sh Executable file

@ -0,0 +1,16 @@
#!/bin/bash
pip uninstall -y magic-cache
# Remove old build artifacts
rm -rf build dist *.egg-info
# compile interceptor
rm -f magic/hocuspocus.so
gcc -shared -fPIC -o magic/hocuspocus.so magic/hocuspocus.c -ldl
# Build the package
python -m build
# Install the package
pip install dist/*.whl

2
magic/__init__.py Normal file

@ -0,0 +1,2 @@
# magic/__init__.py
from .magic import hocuspocus

342
magic/hocuspocus.c Normal file

@ -0,0 +1,342 @@
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
static int (*real_open)(const char *pathname, int flags, ...) = NULL;
static int (*real_open64)(const char *pathname, int flags, ...) = NULL;
static int (*real_openat)(int dirfd, const char *pathname, int flags,
...) = NULL;
static FILE *(*real_fopen)(const char *pathname, const char *mode) = NULL;
static FILE *(*real_fopen64)(const char *pathname, const char *mode) = NULL;
static int inform_cache_hit = 0;
static int inform_cache_miss = 0;
static int inform_cache_write = 1;
static int verbose = 0;
static char **filetype_whitelist = NULL;
static int filetype_whitelist_count = 0;
static char **file_blacklist = NULL;
static int file_blacklist_count = 0;
static char cache_dir[4096] = {0};
void init_real_functions() {
real_open = dlsym(RTLD_NEXT, "open");
real_open64 = dlsym(RTLD_NEXT, "open64");
real_openat = dlsym(RTLD_NEXT, "openat");
real_fopen = dlsym(RTLD_NEXT, "fopen");
real_fopen64 = dlsym(RTLD_NEXT, "fopen64");
if (!real_open || !real_open64 || !real_openat || !real_fopen ||
!real_fopen64) {
fprintf(stderr,
"[hocuspocus.so] Error: Failed to initialize real functions\n");
exit(EXIT_FAILURE);
}
if (verbose) {
fprintf(stderr,
"[hocuspocus.so] Real functions initialized successfully\n");
}
}
void init_config() {
verbose = getenv("MAGIC_VERBOSE") ? atoi(getenv("MAGIC_VERBOSE")) : 0;
inform_cache_hit = getenv("MAGIC_INFORM_CACHE_HIT")
? atoi(getenv("MAGIC_INFORM_CACHE_HIT"))
: 0;
inform_cache_miss = getenv("MAGIC_INFORM_CACHE_MISS")
? atoi(getenv("MAGIC_INFORM_CACHE_MISS"))
: 0;
inform_cache_write = getenv("MAGIC_INFORM_CACHE_WRITE")
? atoi(getenv("MAGIC_INFORM_CACHE_WRITE"))
: 1;
const char *cache_dir_env = getenv("MAGIC_CACHE_DIR");
if (cache_dir_env) {
strncpy(cache_dir, cache_dir_env, sizeof(cache_dir) - 1);
cache_dir[sizeof(cache_dir) - 1] = '\0';
}
if (verbose) {
fprintf(stderr,
"[hocuspocus.so] Config initialized. inform_cache_hit: %d, "
"inform_cache_miss: %d, inform_cache_write: %d, verbose: %d\n",
inform_cache_hit, inform_cache_miss, inform_cache_write, verbose);
}
  char *whitelist = getenv("MAGIC_FILETYPE_WHITELIST");
  if (whitelist && *whitelist) {
    // Tokenize a private copy: strtok() modifies its argument.
    whitelist = strdup(whitelist);
    int capacity = 1;
    for (char *p = whitelist; p && *p; p++) {
      if (*p == ',') capacity++;
    }
    filetype_whitelist = whitelist ? malloc(capacity * sizeof(char *)) : NULL;
    if (!filetype_whitelist) {
      fprintf(
          stderr,
          "[hocuspocus.so] Error: Failed to allocate memory for whitelist\n");
      exit(EXIT_FAILURE);
    }
    // strtok() skips empty fields, so count only the tokens it returns.
    for (char *tok = strtok(whitelist, ","); tok; tok = strtok(NULL, ",")) {
      filetype_whitelist[filetype_whitelist_count++] = tok;
    }
  }
  char *blacklist = getenv("MAGIC_FILE_BLACKLIST");
  // An empty blacklist (the Python default) must yield zero entries;
  // otherwise should_cache() would pass a NULL pattern to strstr().
  if (blacklist && *blacklist) {
    blacklist = strdup(blacklist);
    int capacity = 1;
    for (char *p = blacklist; p && *p; p++) {
      if (*p == ',') capacity++;
    }
    file_blacklist = blacklist ? malloc(capacity * sizeof(char *)) : NULL;
    if (!file_blacklist) {
      fprintf(
          stderr,
          "[hocuspocus.so] Error: Failed to allocate memory for blacklist\n");
      exit(EXIT_FAILURE);
    }
    for (char *tok = strtok(blacklist, ","); tok; tok = strtok(NULL, ",")) {
      file_blacklist[file_blacklist_count++] = tok;
    }
  }
}
int should_cache(const char *pathname) {
for (int i = 0; i < file_blacklist_count; i++) {
if (strstr(pathname, file_blacklist[i])) {
return 0;
}
}
if (filetype_whitelist_count == 0) return 1;
const char *ext = strrchr(pathname, '.');
if (ext) {
for (int i = 0; i < filetype_whitelist_count; i++) {
if (strcmp(ext, filetype_whitelist[i]) == 0) {
return 1;
}
}
}
return 0;
}
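int is_cached_file(const char *pathname) {
  if (!cache_dir[0]) return 0;
  // Build the same flattened name that get_cached_path() uses ('/' -> '_');
  // otherwise the lookup never matches the copy written on a cache miss.
  char safe_path[4096];
  strncpy(safe_path, pathname, sizeof(safe_path) - 1);
  safe_path[sizeof(safe_path) - 1] = '\0';
  for (char *p = safe_path; *p; ++p) {
    if (*p == '/') *p = '_';
  }
  static char cached_path[8192];
  int len = snprintf(cached_path, sizeof(cached_path), "%s/%s", cache_dir,
                     safe_path);
  if (len < 0 || (size_t)len >= sizeof(cached_path)) {
    fprintf(stderr,
            "[hocuspocus.so] Warning: Cached path is too long and was "
            "truncated: %s/%s\n",
            cache_dir, safe_path);
    return 0;
  }
  return access(cached_path, F_OK) == 0;
}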
void copy_to_cache(const char *src, const char *dest) {
  // Use the real fopen here: calling the intercepted fopen() would re-enter
  // get_rerouted_path() for the same file and recurse.
  FILE *src_file = real_fopen(src, "rb");
  if (!src_file) {
    if (inform_cache_write) {
      fprintf(stderr,
              "[hocuspocus.so] Warning: Could not open source file %s: %s\n",
              src, strerror(errno));
    }
    return;
  }
  FILE *dest_file = real_fopen(dest, "wb");
  if (!dest_file) {
    if (inform_cache_write) {
      fprintf(
          stderr,
          "[hocuspocus.so] Warning: Could not open destination file %s: %s\n",
          dest, strerror(errno));
    }
    fclose(src_file);
    return;
  }
  // Copy the file contents in fixed-size chunks.
  char buffer[4096];
  size_t bytes;
  while ((bytes = fread(buffer, 1, sizeof(buffer), src_file)) > 0) {
    fwrite(buffer, 1, bytes, dest_file);
  }
  fclose(src_file);
  fclose(dest_file);
  if (inform_cache_write) {
    fprintf(stderr, "[hocuspocus.so] Cached write: %s\n", dest);
  }
}
const char *get_cached_path(const char *pathname) {
static char cached_path[8192];
char safe_path[4096];
strncpy(safe_path, pathname, sizeof(safe_path) - 1);
safe_path[sizeof(safe_path) - 1] = '\0';
for (char *p = safe_path; *p; ++p) {
if (*p == '/') *p = '_';
}
if (snprintf(cached_path, sizeof(cached_path), "%s/%s", cache_dir,
safe_path) >= sizeof(cached_path)) {
fprintf(stderr,
"[hocuspocus.so] Warning: Cached path is too long and was "
"truncated: %s/%s\n",
cache_dir, safe_path);
exit(EXIT_FAILURE);
}
return cached_path;
}
const char *get_rerouted_path(const char *pathname) {
  int should_cache_file = should_cache(pathname);
  int cache_hit = should_cache_file && is_cached_file(pathname);
  const char *cached_path = pathname;
  if (should_cache_file) {
    cached_path = get_cached_path(pathname);
    if (!cache_hit) {
      copy_to_cache(pathname, cached_path);
    }
    // Report on stderr, like every other message, so that the host
    // program's stdout stays clean.
    if (cache_hit && inform_cache_hit) {
      fprintf(stderr, "[hocuspocus.so] Using cached path: %s\n", cached_path);
    } else if (!cache_hit && inform_cache_miss) {
      fprintf(stderr, "[hocuspocus.so] Intercepted open (cache miss): %s\n",
              pathname);
    }
  }
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] rerouted path: %s -> %s\n", pathname,
            cached_path);
  return cached_path;
}
int open_common(const char *pathname, int flags, va_list args) {
const char *cached_path = get_rerouted_path(pathname);
int fd;
if (flags & O_CREAT) {
mode_t mode = va_arg(args, mode_t);
fd = real_open(cached_path, flags, mode);
} else {
fd = real_open(cached_path, flags);
}
if (fd == -1) {
fprintf(stderr, "[hocuspocus.so] Error: open failed for %s: %s\n",
cached_path, strerror(errno));
}
return fd;
}
FILE *fopen_common(const char *pathname, const char *mode, int is_fopen64) {
const char *cached_path = get_rerouted_path(pathname);
FILE *file;
if (verbose) {
fprintf(stderr, "[hocuspocus.so] Attempting to fopen%s: %s with mode: %s\n",
is_fopen64 ? "64" : "", cached_path, mode);
}
if (is_fopen64) {
file = real_fopen64(cached_path, mode);
} else {
file = real_fopen(cached_path, mode);
}
if (!file) {
fprintf(stderr, "[hocuspocus.so] Error: fopen%s failed for %s: %s\n",
is_fopen64 ? "64" : "", cached_path, strerror(errno));
} else {
if (verbose) {
fprintf(stderr, "[hocuspocus.so] fopen%s succeeded for %s\n",
is_fopen64 ? "64" : "", cached_path);
}
}
return file;
}
__attribute__((constructor)) void init() {
init_config();
init_real_functions();
if (verbose) {
fprintf(stderr, "[hocuspocus.so] Initialization complete\n");
}
}
int open(const char *pathname, int flags, ...) {
if (verbose)
fprintf(stderr, "[hocuspocus.so] Intercepting open: %s\n", pathname);
va_list args;
va_start(args, flags);
int fd = open_common(pathname, flags, args);
va_end(args);
return fd;
}
int open64(const char *pathname, int flags, ...) {
if (verbose)
fprintf(stderr, "[hocuspocus.so] Intercepting open64: %s\n", pathname);
va_list args;
va_start(args, flags);
int fd = open_common(pathname, flags, args);
va_end(args);
return fd;
}
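int openat(int dirfd, const char *pathname, int flags, ...) {
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] Intercepting openat: %s\n", pathname);
  va_list args;
  va_start(args, flags);
  int fd;
  if (dirfd != AT_FDCWD && pathname[0] != '/') {
    // The path is relative to dirfd, which we cannot resolve here,
    // so pass the call through to the real openat uncached.
    mode_t mode = (flags & O_CREAT) ? va_arg(args, mode_t) : 0;
    fd = real_openat(dirfd, pathname, flags, mode);
  } else {
    fd = open_common(pathname, flags, args);
  }
  va_end(args);
  return fd;
}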
FILE *fopen(const char *pathname, const char *mode) {
if (verbose)
fprintf(stderr, "[hocuspocus.so] Intercepting fopen: %s with mode: %s\n",
pathname, mode);
return fopen_common(pathname, mode, 0);
}
FILE *fopen64(const char *pathname, const char *mode) {
if (verbose) {
fprintf(stderr, "[hocuspocus.so] Intercepting fopen64: %s with mode: %s\n",
pathname, mode);
}
FILE *file = fopen_common(pathname, mode, 1);
if (!file && verbose) {
fprintf(stderr, "[hocuspocus.so] fopen64 returned NULL for %s\n", pathname);
}
return file;
}

44
magic/magic.py Normal file

@ -0,0 +1,44 @@
import os
import sys
import subprocess
def hocuspocus(inform_cache_hit=False, inform_cache_miss=False, inform_cache_write=True,
               filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist="", verbose=False):
    # Check if already active
    if os.getenv("MAGIC_ACTIVE") == "1":
        return
    # Gather the current Python executable and script arguments
    python_executable = sys.executable
    script_name = sys.argv[0]
    script_args = sys.argv[1:]
    # Set environment variables for hocuspocus.so
    env = os.environ.copy()
    env["MAGIC_ACTIVE"] = "1"
    env["MAGIC_INFORM_CACHE_HIT"] = "1" if inform_cache_hit else "0"
    env["MAGIC_INFORM_CACHE_MISS"] = "1" if inform_cache_miss else "0"
    env["MAGIC_INFORM_CACHE_WRITE"] = "1" if inform_cache_write else "0"
    env["MAGIC_FILETYPE_WHITELIST"] = filetype_whitelist
    env["MAGIC_FILE_BLACKLIST"] = file_blacklist
    env["MAGIC_CACHE_DIR"] = os.path.abspath("cache_dir")
    env["MAGIC_CURRENT_DIR"] = os.path.abspath(".")
    env["MAGIC_VERBOSE"] = "1" if verbose else "0"
    # Ensure the cache directory exists
    os.makedirs(env["MAGIC_CACHE_DIR"], exist_ok=True)
    # Ensure hocuspocus.so exists
    intercept_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "hocuspocus.so"))
    if not os.path.exists(intercept_path):
        raise FileNotFoundError(f"hocuspocus.so not found at {intercept_path}")
    # Re-run the script with LD_PRELOAD set to the intercept library
    env["LD_PRELOAD"] = intercept_path
    # Re-execute the current script with the same arguments
    command = [python_executable, script_name] + script_args
    result = subprocess.run(command, env=env)
    # Exit with the same return code as the subprocess
    sys.exit(result.returncode)

29
setup.py Normal file

@ -0,0 +1,29 @@
import setuptools
import os
import subprocess
class CustomBuild(setuptools.Command):
    description = "Compile the C extension"
    user_options = []
    def initialize_options(self):
        pass
    def finalize_options(self):
        pass
    def run(self):
        cwd = os.path.abspath(os.path.dirname(__file__))
        # Same compile command as install_package.sh
        subprocess.check_call([
            "gcc", "-shared", "-fPIC",
            "-o", os.path.join(cwd, "magic", "hocuspocus.so"),
            os.path.join(cwd, "magic", "hocuspocus.c"),
            "-ldl",
        ])
setuptools.setup(
    name="magic_cache",
    version="0.1.0",
    description="A magic caching library for Python",
    author="Your Name",
    author_email="your.email@example.com",
    packages=setuptools.find_packages(where="."),
    # Ship the compiled interceptor inside the package
    package_data={"magic": ["hocuspocus.so"]},
    cmdclass={"build_ext": CustomBuild},
    include_package_data=True,
)

14
trivial.py Normal file

@ -0,0 +1,14 @@
import magic
magic.hocuspocus(inform_cache_hit=True, inform_cache_miss=True, inform_cache_write=False, verbose=True)
print('hi')
import numpy as np
import torch
print("Running actual code:")
np_array = np.array([1, 2, 3])
print(f"Numpy array: {np_array}")
tensor = torch.tensor([1, 2, 3])
print(f"Torch tensor: {tensor}")