Initial commit
commit
c31fe1e75a
7
.gitignore
vendored
Normal file
@ -0,0 +1,7 @@
cache_dir
dist
*.egg-info
*.assets
.venv
*/__pycache__

2
MANIFEST.in
Normal file
@ -0,0 +1,2 @@
include magic/hocuspocus.c
include magic/hocuspocus.so
80
README.md
Normal file
@ -0,0 +1,80 @@
<h1 align="center">
  <br>
  <img src='./icon.svg' width="250px">
</h1>


# MAGIC (**M**agic **A**ccelerates via **G**eneral **I**ntercept-based **C**aching)

This Python module **MAGIC**ally mitigates performance issues on BWUni caused by faulty OS cache configuration.

The term "**HOCUSPOCUS**" (**H**ocuspocus **O**vercomes **C**onfiguration **U**psies for **S**uperior **P**erformance: **O**ptimized **C**aching via **I**ntercepting **U**serspace **S**yscalls) finds its origins in a misinterpretation from the 17th century, rooted in the Latin phrase "Hoc est corpus" used during the Catholic Mass to signify the transformation of bread into the Body of Christ. To those unfamiliar with Latin, this sacred invocation sounded like mystical jargon, which they mockingly or mistakenly transformed into "hocus pocus."

## Function

`hocuspocus(inform_cache_hit=False, inform_cache_miss=False, inform_cache_write=True, filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist="", verbose=False)`

Configures and activates the caching mechanism, which uses RAM and the local SSD for caching. By default, it caches specific file types (.py, .pyc, .so, .dll) and reports cache writes. Call `hocuspocus()` at the very beginning of your script, before importing other packages. (More accurately: the script is re-executed with the interceptor loaded, so any code before `magic.hocuspocus()` runs twice and must not have any side effects!) The cached copies are written to a `cache_dir` directory inside the current working directory. If you want to customize the caching behavior, use the parameters provided.

### Parameters:

- `inform_cache_hit`: Boolean flag to print cache hits (default: False).
- `inform_cache_miss`: Boolean flag to print cache misses (default: False).
- `inform_cache_write`: Boolean flag to print cache writes (default: True).
- `filetype_whitelist`: Comma-separated string of file extensions to cache, compared against everything from the last `.` in the path onwards (default: ".py,.pyc,.so,.dll").
- `file_blacklist`: Comma-separated string of path fragments; any file whose path contains one of them is excluded from caching (default: "").
- `verbose`: Boolean flag to print debug output from the interceptor (default: False).

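
The whitelist and blacklist rules can be sketched in a few lines of Python. This is only an illustration of the matching logic described above, not the code the package runs (the real check lives in `hocuspocus.c`); the helper names `parse_list` and `should_cache` as Python functions are assumptions made for this example.

```python
def parse_list(value):
    """Split a comma-separated option string, dropping empty entries."""
    return [item for item in value.split(",") if item]


def should_cache(path, filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist=""):
    """Return True if `path` passes the blacklist and whitelist checks."""
    # Blacklist entries are plain substrings of the path.
    if any(fragment in path for fragment in parse_list(file_blacklist)):
        return False

    extensions = parse_list(filetype_whitelist)
    if not extensions:
        return True  # an empty whitelist means "cache everything"

    # Everything from the last dot onwards counts as the extension.
    dot = path.rfind(".")
    return dot != -1 and path[dot:] in extensions


print(should_cache("/usr/lib/python3/site-packages/numpy/__init__.py"))
print(should_cache("/data/train.csv"))
print(should_cache("/home/me/project/model.py", file_blacklist="/home/me/project"))
```

With the default settings the first call is accepted, the second is rejected because `.csv` is not whitelisted, and the third is rejected by the blacklist.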
## Example Usage

```python
import magic
magic.hocuspocus()

import numpy as np
import torch as th
# Your code here
```

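
Why must nothing with side effects come before the `hocuspocus()` call? The following self-contained sketch imitates the relaunch with a stand-in script, so it runs without the package installed. `DEMO_ACTIVE` and `demo.py` are hypothetical names chosen for this example; the stand-in simply re-executes itself once with a guard variable set, which is the pattern described above.

```python
import os
import subprocess
import sys
import tempfile
import textwrap

# Hypothetical stand-in for a user script. DEMO_ACTIVE plays the role of the
# guard variable that marks the relaunched process.
SCRIPT = textwrap.dedent('''
    import os, subprocess, sys
    print("before guard", flush=True)  # runs in the parent AND in the child
    if os.environ.get("DEMO_ACTIVE") != "1":
        env = dict(os.environ, DEMO_ACTIVE="1")
        sys.exit(subprocess.run([sys.executable, sys.argv[0]], env=env).returncode)
    print("after guard", flush=True)   # runs only in the child
''')


def run_demo():
    """Run the stand-in script once and return the lines it printed."""
    env = {k: v for k, v in os.environ.items() if k != "DEMO_ACTIVE"}
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.py")
        with open(path, "w") as fh:
            fh.write(SCRIPT)
        result = subprocess.run([sys.executable, path], env=env,
                                capture_output=True, text=True, check=True)
    return result.stdout.splitlines()


print(run_demo())
```

The line printed before the guard shows up twice and the line after it only once, which is why the call belongs at the very top of the script.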
## How It Works

It's actually not magic... Python relies on the OS cache to keep actively used modules in RAM for quick access. However, an issue at BWUni causes these modules to be evicted repeatedly, leading to frequent and unnecessary reloads that need to pass through the internal network backbone. ATIS has confirmed this as the underlying issue but has yet to find a solution.

We make use of the `LD_PRELOAD` trick to inject a shared library (`hocuspocus.so`) into the Python process. This library overrides the standard file-opening functions (`open`, `open64`, `openat`, `fopen`, `fopen64`) with our custom implementations. These intercept all file opens, allowing us to listen in and, for files that pass our white- and blacklists, forge the returned file descriptors so that they point to a local cache, which we populate automatically, instead of the original files. (OS-level VFS caching also works on these cached copies, so we get a two-level RAM/SSD cache overall.) This explicit caching prevents the erroneous evictions caused by the misconfiguration; once a module is loaded from the cache, it remains quickly accessible, reducing the overhead of repeated file loading and significantly improving performance.

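
The rerouting step can be pictured with a small Python sketch. It is a simplified model under stated assumptions, not the interceptor itself: the function names are made up for this example, the flattened cache file name (every `/` replaced by `_`) follows what `hocuspocus.c` does, and falling back to the original path when the copy fails is a choice of this sketch.

```python
import os
import shutil
import tempfile


def cached_path(cache_dir, path):
    """Flatten `path` into a single file name inside `cache_dir`."""
    return os.path.join(cache_dir, path.replace("/", "_"))


def reroute(cache_dir, path):
    """Return the path to open instead of `path`, filling the cache on a miss."""
    target = cached_path(cache_dir, path)
    if not os.path.exists(target):  # cache miss: copy the file over once
        try:
            shutil.copyfile(path, target)
        except OSError:
            return path  # assumption: fall back to the original file
    return target


# Small demonstration in a throwaway directory.
with tempfile.TemporaryDirectory() as tmp:
    cache = os.path.join(tmp, "cache")
    os.makedirs(cache)
    module = os.path.join(tmp, "mod.py")
    with open(module, "w") as fh:
        fh.write("x = 1\n")

    first = reroute(cache, module)   # miss: the file is copied into the cache
    second = reroute(cache, module)  # hit: the existing copy is reused
    print(first == second, os.path.dirname(first) == cache)
```

Every later open of the same path then goes to the local copy, where the regular OS page cache can keep it in RAM.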
This approach results in a significant decrease in training time and an even more significant decrease in the number of automatic e-mails sent by ATIS regarding 'high I/O activity'.

This package is only meant for Python applications, but the provided `hocuspocus.so` could also work on a wide range of other applications. Have a look at our fairly minimal source code if you wanna try to adapt it...

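
As a starting point for such experiments, here is a hedged sketch of how another program might be launched with the interceptor preloaded. The `MAGIC_*` variables are the ones read by `hocuspocus.c`; the helper name `preload_env`, the example paths and the chosen whitelist are assumptions for illustration, and none of this has been tested against non-Python programs.

```python
import os


def preload_env(so_path, cache_dir, base_env=None):
    """Build an environment that preloads the interceptor for any command."""
    env = dict(os.environ if base_env is None else base_env)
    env["MAGIC_CACHE_DIR"] = os.path.abspath(cache_dir)
    env["MAGIC_FILETYPE_WHITELIST"] = ".so"  # hypothetical choice for this example
    env["MAGIC_INFORM_CACHE_WRITE"] = "1"
    # Keep any libraries that are already being preloaded.
    existing = env.get("LD_PRELOAD")
    env["LD_PRELOAD"] = so_path if not existing else f"{so_path}:{existing}"
    return env


env = preload_env("/path/to/hocuspocus.so", "cache_dir", base_env={})
print(env["LD_PRELOAD"])
# A launch would then look like:
#   subprocess.run(["some-program"], env=preload_env(so_path, cache_dir))
```

Note that the library itself does not create the cache directory (the Python wrapper does that), so it has to exist before the program starts.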
## Benchmarks

### Training wall-clock-time reduction for RL workloads

TODO

### Automatic ATIS e-mail reduction

#### Before:

![mail_before](benchmarks/mail_before.png)

#### After:

![mail_after](benchmarks/mail_after.png)

We achieve a 100% reduction in automatic mails received from ATIS.

## Authors

ChatGPT-4o (Lead Developer)
Dominik Roth (Manager, Assistant Developer and Benchmarking)

Questions should primarily be directed at the lead developer (ChatGPT-4o).

## Donations

DogeCoin: DGUjmkYd3pzV2ovUydRs6c1dmd6AHV4Aby

Up to 50% of the total funds received through donations will be forwarded to Sam Altman's 7 trillion USD funding round.


*Note: ATIS seems to be highly competent most of the time. This repo is not meant as an attack, it is merely the result of coding while in a silly goofy mood.*
BIN
benchmarks/mail_after.png
Normal file
Binary file not shown.
Size: 20 KiB
BIN
benchmarks/mail_before.png
Normal file
Binary file not shown.
Size: 37 KiB
25
fib.py
Normal file
@ -0,0 +1,25 @@
import magic
magic.hocuspocus()

import numpy as np
import pickle
import json
import sys

def fib(n):
    # Define the transformation matrix
    F = np.array([[1, 1],
                  [1, 0]], dtype=object)

    # Use matrix exponentiation
    return np.linalg.matrix_power(F, n-1)[0, 0]

import numpy as np2

# Example usage
n = 100  # Change n to compute a different Fibonacci number
print(f"The {n}th Fibonacci number is {fib(n)}")

print(np.random.mtrand.beta(1,2))

# Debug output: only a custom meta-path finder would expose `module_cache`
print(getattr(sys.meta_path[0], "module_cache", {}).keys())
81
icon.svg
Normal file
@ -0,0 +1,81 @@
<?xml version="1.0" encoding="UTF-8" standalone="no"?>
<!-- Created with Inkscape (http://www.inkscape.org/) -->

<svg
   width="209.35104mm"
   height="207.26178mm"
   viewBox="0 0 209.35104 207.26178"
   version="1.1"
   id="svg5"
   xml:space="preserve"
   inkscape:version="1.3.2 (091e20ef0f, 2023-11-25)"
   sodipodi:docname="fancy_magic.svg"
   xmlns:inkscape="http://www.inkscape.org/namespaces/inkscape"
   xmlns:sodipodi="http://sodipodi.sourceforge.net/DTD/sodipodi-0.dtd"
   xmlns="http://www.w3.org/2000/svg"
   xmlns:svg="http://www.w3.org/2000/svg"><sodipodi:namedview
     id="namedview7"
     pagecolor="#ffffff"
     bordercolor="#999999"
     borderopacity="1"
     inkscape:showpageshadow="0"
     inkscape:pageopacity="0"
     inkscape:pagecheckerboard="0"
     inkscape:deskcolor="#d1d1d1"
     inkscape:document-units="mm"
     showgrid="false"
     inkscape:zoom="0.70026212"
     inkscape:cx="427.69699"
     inkscape:cy="422.69886"
     inkscape:window-width="1720"
     inkscape:window-height="1403"
     inkscape:window-x="26"
     inkscape:window-y="23"
     inkscape:window-maximized="0"
     inkscape:current-layer="g1"
     showguides="false" /><defs
     id="defs2"><linearGradient
       id="linearGradient3331"
       inkscape:swatch="solid"><stop
         style="stop-color:#000000;stop-opacity:1;"
         offset="0"
         id="stop3329" /></linearGradient></defs><g
     inkscape:label="Layer 1"
     inkscape:groupmode="layer"
     id="layer1"
     transform="translate(-0.33538006,-89.544919)"><path
       id="path243"
       style="fill:#1b9783;fill-opacity:1;stroke-width:1.82698"
       d="M 105.01116,89.544922 C 47.200277,89.544813 0.33526959,135.94201 0.33538005,193.17581 0.33527406,250.40961 47.20028,296.8068 105.01116,296.80669 162.82184,296.80652 209.68654,250.40941 209.68643,193.17581 209.68654,135.9422 162.82184,89.545091 105.01116,89.544922 Z"
       sodipodi:nodetypes="ccccc" /><g
       id="g8676"
       transform="matrix(0.35277778,0,0,0.35277778,93.730168,-46.079548)"><path
         d="m 210.96344,516.0861 12.37693,-4.7124 -12.37693,-4.71237 v -0.002 c -2.44481,-0.93931 -4.69468,-3.36861 -6.45332,-6.97435 -1.75974,-3.6034 -2.94651,-8.21411 -3.4035,-13.22394 l -2.29965,-25.36242 -2.29965,25.36242 h -9.7e-4 c -0.45329,5.01503 -1.63749,9.62822 -3.3973,13.23664 -1.76102,3.60599 -4.01242,6.03262 -6.45951,6.96165 l -12.37693,4.71238 12.39453,4.71498 c 2.44862,0.92368 4.70251,3.35031 6.46212,6.95631 1.76101,3.60862 2.94394,8.22414 3.3947,13.24199 l 2.29965,25.32299 2.29966,-25.32299 h 9.6e-4 c 0.44949,-5.01245 1.63115,-9.62555 3.38851,-13.23197 1.75593,-3.60861 4.0059,-6.03791 6.45071,-6.967 z"
         id="path8600"
         style="fill:#ffffff;fill-opacity:1;stroke-width:0.46663" /><path
         d="m 241.05123,518.51515 c 0.30634,-3.04032 1.58367,-5.41746 3.21463,-5.98793 l 4.02723,-1.3888 -4.02723,-1.3888 c -1.63567,-0.55553 -2.91649,-2.94127 -3.21463,-5.98793 l -0.74559,-7.50158 -0.74558,7.50158 c -0.30634,3.03817 -1.58251,5.41746 -3.21463,5.98793 l -4.02723,1.3888 4.02723,1.3888 c 1.62528,0.59427 2.89569,2.96067 3.21463,5.98793 l 0.74558,7.50158 z"
         id="path8602-6-7"
         style="fill:#ffffff;fill-opacity:1;stroke-width:0.403881" /><path
         d="m 177.89263,495.09636 c 0.25962,-2.6322 1.34213,-4.69025 2.72434,-5.18414 l 3.413,-1.20237 -3.413,-1.20238 c -1.3862,-0.48096 -2.47167,-2.54645 -2.72434,-5.18414 l -0.63187,-6.4946 -0.63187,6.4946 c -0.25961,2.63034 -1.34114,4.69025 -2.72433,5.18414 l -3.413,1.20238 3.413,1.20237 c 1.37739,0.5145 2.45404,2.56324 2.72433,5.18414 l 0.63187,6.4946 z"
         id="path8602-6-7-6"
         style="fill:#ffffff;fill-opacity:1;stroke-width:0.345954" /><path
         d="m 231.84066,571.8627 c 0.32831,-3.0209 1.18695,-5.79883 2.46229,-7.9706 1.27354,-2.17492 2.90247,-3.63925 4.67382,-4.1991 l 8.97453,-2.60256 -8.97453,-2.86707 c -1.76959,-0.55373 -3.39852,-2.00884 -4.67198,-4.17454 -1.27353,-2.1657 -2.134,-4.93741 -2.46408,-7.95247 l -1.73354,-15.52615 -1.65595,15.30485 c -0.33011,3.01474 -1.19055,5.78653 -2.46408,7.95247 -1.27355,2.16594 -2.90249,3.62083 -4.67199,4.17454 l -8.86969,3.08861 8.94728,2.82401 v -0.003 c 1.77139,0.56295 3.40033,2.02725 4.67383,4.19911 1.27533,2.17491 2.13399,4.95284 2.46409,7.97374 l 1.57661,15.04023 z"
         id="path8604"
         style="fill:#ffffff;fill-opacity:1;stroke-width:0.603057" /><path
         d="m 162.84035,471.82125 c 0.48729,-2.70075 1.76171,-5.1843 3.65462,-7.12591 1.89023,-1.94444 4.30794,-3.25359 6.93705,-3.75411 l 13.32028,-2.32675 -13.32028,-2.56324 c -2.6265,-0.49504 -5.04422,-1.79595 -6.93432,-3.73215 -1.89022,-1.93619 -3.16735,-4.41416 -3.65728,-7.10971 l -2.57297,-13.88077 -2.45781,13.68293 c -0.48996,2.69525 -1.76706,5.1733 -3.65729,7.10971 -1.89023,1.93641 -4.30796,3.23711 -6.93431,3.73214 l -13.1647,2.7613 13.27985,2.52474 v -0.003 c 2.62916,0.5033 5.04689,1.81242 6.93705,3.75412 1.89291,1.94442 3.16735,4.42797 3.65729,7.12872 l 2.34007,13.44636 z"
         id="path8604-3"
         style="fill:#ffffff;fill-opacity:1;stroke-width:0.694679" /></g><g
       style="fill:#ffffff"
       id="g3"
       transform="matrix(0.21747691,0,0,0.21747691,49.445442,135.78153)"><g
         id="g2"
         style="fill:#ffffff">
	<g
   id="g1"
   style="fill:#ffffff">
		<path
   d="m 528.2755,197.2411 -53.42959,-73.58885 30.33106,-85.732703 c 2.7299,-7.717696 0.84135,-16.31623 -4.86989,-22.180115 -5.71233,-5.8638986 -14.25849,-7.9758475 -22.04501,-5.448947 l -86.49746,28.075966 -72.16471,-55.33693 c -6.49446,-4.981259 -15.25719,-5.843848 -22.59915,-2.223436 -7.34195,3.62041 -11.9919,11.0965105 -11.99409,19.28174992 l -0.0283,90.93911508 -74.9309,51.53103 c -6.74597,4.63905 -10.27049,12.70457 -9.09624,20.80674 1.17534,8.10109 6.84706,14.83461 14.6323,17.36517 l 63.37456,20.61275 -298.091966,290.38781 c -8.508081,8.28816 -8.686362,21.90414 -0.398201,30.41221 4.14353,4.25457 9.619553,6.42697 15.1219931,6.49902 5.5024392,0.0721 11.0345304,-1.95514 15.2880489,-6.10084 L 308.96985,232.1541 l 18.9464,63.89252 c 2.32586,7.84776 8.90858,13.69382 16.9761,15.08196 1.12309,0.19195 2.24729,0.29458 3.36399,0.3092 6.90917,0.0905 13.52982,-3.15951 17.67379,-8.85755 l 53.47522,-73.55602 90.9087,2.35269 c 8.20656,0.19182 15.77778,-4.24159 19.59024,-11.48513 3.81247,-7.24356 3.18177,-16.02476 -1.62879,-22.65067 z M 409.21759,185.73284 c -7.12489,-0.22859 -13.79018,3.12579 -17.95184,8.85175 l -34.49879,47.45316 -16.67954,-56.24793 c -2.01186,-6.78516 -7.23507,-12.14806 -13.96631,-14.3369 l -55.79192,-18.14652 48.34015,-33.24453 c 5.83297,-4.01138 9.31804,-10.63475 9.31874,-17.71392 l 0.0182,-58.668976 46.55645,35.700284 c 5.61584,4.306319 12.99228,5.573932 19.72776,3.390418 L 450.09282,64.657862 430.52618,119.9669 c -2.36017,6.67381 -1.28721,14.0809 2.87272,19.80975 l 34.47018,47.47415 z"
   id="path1"
   style="fill:#ffffff;stroke-width:1.08219" />
	</g>
</g></g></g></svg>
Size: 6.7 KiB
16
install_package.sh
Executable file
@ -0,0 +1,16 @@
#!/bin/bash

# -y: do not stop for an interactive confirmation
pip uninstall -y magic-cache

# Remove old build artifacts
rm -rf build dist *.egg-info

# compile interceptor (-f: no error if there is no previous build)
rm -f magic/hocuspocus.so
gcc -shared -fPIC -o magic/hocuspocus.so magic/hocuspocus.c -ldl

# Build the package
python -m build

# Install the package
pip install dist/*.whl
2
magic/__init__.py
Normal file
@ -0,0 +1,2 @@
# magic/__init__.py
from .magic import hocuspocus
342
magic/hocuspocus.c
Normal file
@ -0,0 +1,342 @@
#define _GNU_SOURCE
#include <dlfcn.h>
#include <errno.h>
#include <fcntl.h>
#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

static int (*real_open)(const char *pathname, int flags, ...) = NULL;
static int (*real_open64)(const char *pathname, int flags, ...) = NULL;
static int (*real_openat)(int dirfd, const char *pathname, int flags,
                          ...) = NULL;
static FILE *(*real_fopen)(const char *pathname, const char *mode) = NULL;
static FILE *(*real_fopen64)(const char *pathname, const char *mode) = NULL;

static int inform_cache_hit = 0;
static int inform_cache_miss = 0;
static int inform_cache_write = 1;
static int verbose = 0;
static char **filetype_whitelist = NULL;
static int filetype_whitelist_count = 0;
static char **file_blacklist = NULL;
static int file_blacklist_count = 0;
static char cache_dir[4096] = {0};

void init_real_functions() {
  real_open = dlsym(RTLD_NEXT, "open");
  real_open64 = dlsym(RTLD_NEXT, "open64");
  real_openat = dlsym(RTLD_NEXT, "openat");
  real_fopen = dlsym(RTLD_NEXT, "fopen");
  real_fopen64 = dlsym(RTLD_NEXT, "fopen64");

  if (!real_open || !real_open64 || !real_openat || !real_fopen ||
      !real_fopen64) {
    fprintf(stderr,
            "[hocuspocus.so] Error: Failed to initialize real functions\n");
    exit(EXIT_FAILURE);
  }

  if (verbose) {
    fprintf(stderr,
            "[hocuspocus.so] Real functions initialized successfully\n");
  }
}

/* Split a comma-separated environment variable into an array of tokens.
 * strtok() modifies its input and skips empty fields, so this works on a
 * private copy and counts only the tokens that were actually found. */
static char **parse_list(const char *env_name, int *count) {
  *count = 0;
  const char *value = getenv(env_name);
  if (!value || !*value) return NULL;

  int capacity = 1;
  for (const char *p = value; *p; p++) {
    if (*p == ',') capacity++;
  }

  char *copy = strdup(value);
  char **items = malloc(capacity * sizeof(char *));
  if (!copy || !items) {
    fprintf(stderr,
            "[hocuspocus.so] Error: Failed to allocate memory for %s\n",
            env_name);
    exit(EXIT_FAILURE);
  }

  for (char *tok = strtok(copy, ","); tok; tok = strtok(NULL, ",")) {
    items[(*count)++] = tok;
  }
  return items;
}

void init_config() {
  verbose = getenv("MAGIC_VERBOSE") ? atoi(getenv("MAGIC_VERBOSE")) : 0;
  inform_cache_hit = getenv("MAGIC_INFORM_CACHE_HIT")
                         ? atoi(getenv("MAGIC_INFORM_CACHE_HIT"))
                         : 0;
  inform_cache_miss = getenv("MAGIC_INFORM_CACHE_MISS")
                          ? atoi(getenv("MAGIC_INFORM_CACHE_MISS"))
                          : 0;
  inform_cache_write = getenv("MAGIC_INFORM_CACHE_WRITE")
                           ? atoi(getenv("MAGIC_INFORM_CACHE_WRITE"))
                           : 1;

  const char *cache_dir_env = getenv("MAGIC_CACHE_DIR");
  if (cache_dir_env) {
    strncpy(cache_dir, cache_dir_env, sizeof(cache_dir) - 1);
    cache_dir[sizeof(cache_dir) - 1] = '\0';
  }

  if (verbose) {
    fprintf(stderr,
            "[hocuspocus.so] Config initialized. inform_cache_hit: %d, "
            "inform_cache_miss: %d, inform_cache_write: %d, verbose: %d\n",
            inform_cache_hit, inform_cache_miss, inform_cache_write, verbose);
  }

  /* An unset or empty variable yields an empty list (count == 0). */
  filetype_whitelist =
      parse_list("MAGIC_FILETYPE_WHITELIST", &filetype_whitelist_count);
  file_blacklist = parse_list("MAGIC_FILE_BLACKLIST", &file_blacklist_count);
}

int should_cache(const char *pathname) {
  for (int i = 0; i < file_blacklist_count; i++) {
    /* strtok() can leave NULL slots (e.g. for an empty list), and an empty
     * pattern would match every path, so skip both. */
    if (!file_blacklist[i] || !*file_blacklist[i]) continue;
    if (strstr(pathname, file_blacklist[i])) {
      return 0;
    }
  }

  if (filetype_whitelist_count == 0) return 1;

  const char *ext = strrchr(pathname, '.');
  if (ext) {
    for (int i = 0; i < filetype_whitelist_count; i++) {
      if (filetype_whitelist[i] && strcmp(ext, filetype_whitelist[i]) == 0) {
        return 1;
      }
    }
  }
  return 0;
}

int is_cached_file(const char *pathname) {
  if (!cache_dir[0]) return 0;

  /* Cache entries are stored under a flattened name ('/' replaced by '_'),
   * exactly as get_cached_path() builds them, so probe for that name. */
  char safe_path[4096];
  strncpy(safe_path, pathname, sizeof(safe_path) - 1);
  safe_path[sizeof(safe_path) - 1] = '\0';
  for (char *p = safe_path; *p; ++p) {
    if (*p == '/') *p = '_';
  }

  char cached_path[8192];
  int len = snprintf(cached_path, sizeof(cached_path), "%s/%s", cache_dir,
                     safe_path);
  if (len < 0 || (size_t)len >= sizeof(cached_path)) {
    fprintf(stderr,
            "[hocuspocus.so] Warning: Cached path is too long and was "
            "truncated: %s/%s\n",
            cache_dir, safe_path);
    return 0;
  }

  return access(cached_path, F_OK) == 0;
}

void copy_to_cache(const char *src, const char *dest) {
  /* Use the real fopen here: a plain fopen() call would resolve to the
   * interposed fopen() below and recurse back into the cache logic. */
  FILE *src_file = real_fopen(src, "rb");
  if (!src_file) {
    if (inform_cache_write) {
      fprintf(stderr,
              "[hocuspocus.so] Warning: Could not open source file %s: %s\n",
              src, strerror(errno));
    }
    return;
  }

  FILE *dest_file = real_fopen(dest, "wb");
  if (!dest_file) {
    if (inform_cache_write) {
      fprintf(
          stderr,
          "[hocuspocus.so] Warning: Could not open destination file %s: %s\n",
          dest, strerror(errno));
    }
    fclose(src_file);
    return;
  }

  char buffer[4096];
  size_t bytes;
  int failed = 0;
  while ((bytes = fread(buffer, 1, sizeof(buffer), src_file)) > 0) {
    if (fwrite(buffer, 1, bytes, dest_file) != bytes) {
      failed = 1;
      break;
    }
  }

  fclose(src_file);
  if (fclose(dest_file) != 0) failed = 1;

  if (failed) {
    /* Do not leave a truncated copy behind in the cache. */
    remove(dest);
    fprintf(stderr, "[hocuspocus.so] Warning: Could not write cache file %s\n",
            dest);
    return;
  }

  if (inform_cache_write) {
    fprintf(stderr, "[hocuspocus.so] Cached write: %s\n", dest);
  }
}

const char *get_cached_path(const char *pathname) {
  static char cached_path[8192];

  char safe_path[4096];
  strncpy(safe_path, pathname, sizeof(safe_path) - 1);
  safe_path[sizeof(safe_path) - 1] = '\0';
  for (char *p = safe_path; *p; ++p) {
    if (*p == '/') *p = '_';
  }

  if (snprintf(cached_path, sizeof(cached_path), "%s/%s", cache_dir,
               safe_path) >= (int)sizeof(cached_path)) {
    fprintf(stderr,
            "[hocuspocus.so] Warning: Cached path is too long and was "
            "truncated: %s/%s\n",
            cache_dir, safe_path);
    exit(EXIT_FAILURE);
  }

  return cached_path;
}

const char *get_rerouted_path(const char *pathname) {
  int should_cache_file = should_cache(pathname);
  int cache_hit = should_cache_file && is_cached_file(pathname);
  const char *cached_path = pathname;

  if (should_cache_file) {
    cached_path = get_cached_path(pathname);
    if (!cache_hit) {
      copy_to_cache(pathname, cached_path);
      /* If the copy failed (e.g. the source does not exist), fall back to
       * the original path so the caller sees the genuine error. */
      if (access(cached_path, F_OK) != 0) cached_path = pathname;
    }
    /* Diagnostics go to stderr so they never mix with program output. */
    if (cache_hit && inform_cache_hit) {
      fprintf(stderr, "[hocuspocus.so] Using cached path: %s\n", cached_path);
    } else if (!cache_hit && inform_cache_miss) {
      fprintf(stderr, "[hocuspocus.so] Intercepted open (cache miss): %s\n",
              pathname);
    }
  }

  if (verbose)
    fprintf(stderr, "[hocuspocus.so] rerouted path: %s -> %s\n", pathname,
            cached_path);

  return cached_path;
}

int open_common(const char *pathname, int flags, va_list args) {
  /* Only plain read-only opens are served from the cache; anything that may
   * create or modify the file must reach the original path. */
  int read_only =
      (flags & O_ACCMODE) == O_RDONLY && !(flags & (O_CREAT | O_TRUNC));
  const char *cached_path = read_only ? get_rerouted_path(pathname) : pathname;

  int fd;
  if (flags & O_CREAT) {
    mode_t mode = va_arg(args, mode_t);
    fd = real_open(cached_path, flags, mode);
  } else {
    fd = real_open(cached_path, flags);
  }

  if (fd == -1) {
    int saved_errno = errno; /* fprintf may clobber errno */
    fprintf(stderr, "[hocuspocus.so] Error: open failed for %s: %s\n",
            cached_path, strerror(saved_errno));
    errno = saved_errno;
  }

  return fd;
}

FILE *fopen_common(const char *pathname, const char *mode, int is_fopen64) {
  /* Only pure read modes ("r", "rb", ...) are served from the cache; write,
   * append and update modes must reach the original file. */
  int read_only = mode[0] == 'r' && !strchr(mode, '+');
  const char *cached_path = read_only ? get_rerouted_path(pathname) : pathname;
  FILE *file;

  if (verbose) {
    fprintf(stderr, "[hocuspocus.so] Attempting to fopen%s: %s with mode: %s\n",
            is_fopen64 ? "64" : "", cached_path, mode);
  }

  if (is_fopen64) {
    file = real_fopen64(cached_path, mode);
  } else {
    file = real_fopen(cached_path, mode);
  }

  if (!file) {
    int saved_errno = errno; /* fprintf may clobber errno */
    fprintf(stderr, "[hocuspocus.so] Error: fopen%s failed for %s: %s\n",
            is_fopen64 ? "64" : "", cached_path, strerror(saved_errno));
    errno = saved_errno;
  } else {
    if (verbose) {
      fprintf(stderr, "[hocuspocus.so] fopen%s succeeded for %s\n",
              is_fopen64 ? "64" : "", cached_path);
    }
  }

  return file;
}

__attribute__((constructor)) void init() {
  init_config();
  init_real_functions();
  if (verbose) {
    fprintf(stderr, "[hocuspocus.so] Initialization complete\n");
  }
}
int open(const char *pathname, int flags, ...) {
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] Intercepting open: %s\n", pathname);

  va_list args;
  va_start(args, flags);
  int fd = open_common(pathname, flags, args);
  va_end(args);

  return fd;
}

int open64(const char *pathname, int flags, ...) {
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] Intercepting open64: %s\n", pathname);

  va_list args;
  va_start(args, flags);
  int fd = open_common(pathname, flags, args);
  va_end(args);

  return fd;
}

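int openat(int dirfd, const char *pathname, int flags, ...) {
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] Intercepting openat: %s\n", pathname);

  /* Honour dirfd: only read-only opens of absolute paths, or of paths
   * relative to the current directory, can be rerouted safely. Everything
   * else is passed through to the real openat() unchanged. */
  int read_only =
      (flags & O_ACCMODE) == O_RDONLY && !(flags & (O_CREAT | O_TRUNC));
  const char *target = pathname;
  if (read_only && (pathname[0] == '/' || dirfd == AT_FDCWD)) {
    target = get_rerouted_path(pathname);
  }

  mode_t mode = 0;
  if (flags & O_CREAT) {
    va_list args;
    va_start(args, flags);
    mode = va_arg(args, mode_t);
    va_end(args);
  }

  return real_openat(dirfd, target, flags, mode);
}
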
FILE *fopen(const char *pathname, const char *mode) {
  if (verbose)
    fprintf(stderr, "[hocuspocus.so] Intercepting fopen: %s with mode: %s\n",
            pathname, mode);

  return fopen_common(pathname, mode, 0);
}

FILE *fopen64(const char *pathname, const char *mode) {
  if (verbose) {
    fprintf(stderr, "[hocuspocus.so] Intercepting fopen64: %s with mode: %s\n",
            pathname, mode);
  }

  FILE *file = fopen_common(pathname, mode, 1);

  if (!file && verbose) {
    fprintf(stderr, "[hocuspocus.so] fopen64 returned NULL for %s\n", pathname);
  }

  return file;
}
44
magic/magic.py
Normal file
@ -0,0 +1,44 @@
import os
import sys
import subprocess

def hocuspocus(inform_cache_hit=False, inform_cache_miss=False, inform_cache_write=True,
               filetype_whitelist=".py,.pyc,.so,.dll", file_blacklist="", verbose=False):
    """Re-run the current script with hocuspocus.so preloaded (no-op in the re-run)."""
    # Check if already active
    if os.getenv("MAGIC_ACTIVE") == "1":
        return

    # Gather the current Python executable and script arguments
    python_executable = sys.executable
    script_name = sys.argv[0]
    script_args = sys.argv[1:]

    # Set environment variables for hocuspocus.so
    env = os.environ.copy()
    env["MAGIC_ACTIVE"] = "1"
    env["MAGIC_INFORM_CACHE_HIT"] = "1" if inform_cache_hit else "0"
    env["MAGIC_INFORM_CACHE_MISS"] = "1" if inform_cache_miss else "0"
    env["MAGIC_INFORM_CACHE_WRITE"] = "1" if inform_cache_write else "0"
    env["MAGIC_FILETYPE_WHITELIST"] = filetype_whitelist
    env["MAGIC_FILE_BLACKLIST"] = file_blacklist
    env["MAGIC_CACHE_DIR"] = os.path.abspath("cache_dir")
    env["MAGIC_CURRENT_DIR"] = os.path.abspath(".")
    env["MAGIC_VERBOSE"] = "1" if verbose else "0"

    # Ensure the cache directory exists
    os.makedirs(env["MAGIC_CACHE_DIR"], exist_ok=True)

    # Ensure hocuspocus.so exists
    intercept_path = os.path.abspath(os.path.join(os.path.dirname(__file__), "hocuspocus.so"))
    if not os.path.exists(intercept_path):
        raise FileNotFoundError(f"hocuspocus.so not found at {intercept_path}")

    # Re-run the script with LD_PRELOAD set to the intercept library
    env["LD_PRELOAD"] = intercept_path

    # Re-execute the current script with the same arguments
    command = [python_executable, script_name] + script_args
    result = subprocess.run(command, env=env)

    # Exit with the same return code as the subprocess
    sys.exit(result.returncode)
29
setup.py
Normal file
@ -0,0 +1,29 @@
import setuptools
import os
import subprocess

class CustomBuild(setuptools.Command):
    description = "Compile the C extension"
    user_options = []

    def initialize_options(self):
        pass

    def finalize_options(self):
        pass

    def run(self):
        cwd = os.path.abspath(os.path.dirname(__file__))
        # This commit ships no magic/compile.sh, so invoke the compiler
        # directly with the same flags as install_package.sh.
        subprocess.check_call([
            "gcc", "-shared", "-fPIC",
            "-o", os.path.join(cwd, "magic", "hocuspocus.so"),
            os.path.join(cwd, "magic", "hocuspocus.c"),
            "-ldl",
        ])

setuptools.setup(
    name="magic_cache",
    version="0.1.0",
    description="A magic caching library for Python",
    author="Your Name",
    author_email="your.email@example.com",
    packages=setuptools.find_packages(where="."),
    cmdclass={"build_ext": CustomBuild},
    include_package_data=True,
)

14
trivial.py
Normal file
@ -0,0 +1,14 @@
import magic
magic.hocuspocus(inform_cache_hit=True, inform_cache_miss=True, inform_cache_write=False, verbose=True)

print('hi')
import numpy as np
import torch

print("Running actual code:")
np_array = np.array([1, 2, 3])
print(f"Numpy array: {np_array}")

tensor = torch.tensor([1, 2, 3])
print(f"Torch tensor: {tensor}")
