Updated README, note about fucked up wavs
parent 26583323bc
commit 2ed53c6464

README.md (27 changed lines)
@@ -12,7 +12,7 @@ The Neuralink N1 implant generates approximately 200 Mbps of electrode data (102
## Data Analysis
-The `analysis.ipynb` notebook contains a detailed analysis of the data. We found that there is sometimes significant cross-correlation between the different leads, making it vital to exploit this information for better compression. This cross-correlation allows us to improve the accuracy of our predictions and reduce the overall amount of data that needs to be transmitted. As part of the analysis, we also note that achieving a 200x compression ratio is highly unlikely to be possible and is also nonsensical; a very close reproduction is sufficient.
+The `analysis.ipynb` notebook contains a detailed analysis of the data. We found that there is sometimes significant cross-correlation between the different leads, making it vital to exploit this information for better compression. This cross-correlation allows us to improve the accuracy of our predictions and reduce the overall amount of data that needs to be transmitted.
## Algorithm Overview
@@ -20,7 +20,7 @@ The `analysis.ipynb` notebook contains a detailed analysis of the data. We found
As the first step, we analyze readings from the leads to construct an approximate topology of the threads in the brain. The distance metric we generate does not represent true Euclidean distance, but rather 'distance' in terms of common activity. This topology only has to be computed once for a given implant; it may be updated to account for thread movement, but it is not part of the regular compression/decompression process.
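A minimal sketch of how such an activity-based distance could be derived, assuming a leads-by-samples input array (this is an illustration, not the repo's actual code):

```python
import numpy as np

def activity_topology(readings: np.ndarray) -> np.ndarray:
    """Derive an activity-based 'distance' matrix from electrode data.

    readings: array of shape (num_leads, num_samples).
    Leads with strongly correlated activity end up close together.
    """
    corr = np.corrcoef(readings)   # pairwise cross-correlation between leads
    return 1.0 - np.abs(corr)      # high |correlation| -> small 'distance'
```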
-### 2 - Predictive Architecture
+### 2 - Predictive Model
The main workhorse of our compression approach is a predictive model running in both the compressor and the decompressor. With good predictions of the data, only the error between prediction and actual data must be transmitted. We make use of the previously constructed topology to let the predictive model's latent state represent the activity of brain regions, inferred from the thread readings, rather than representing each thread on its own.
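As a hedged illustration of the shared-predictor scheme (the names and signatures here are hypothetical, not the repo's API): both sides run the same model, so transmitting the prediction error is enough for exact reconstruction.

```python
def compress_step(predictor, history, actual):
    predicted = predictor(history)   # same predictor runs on both ends
    return actual - predicted        # only the delta is transmitted

def decompress_step(predictor, history, delta):
    predicted = predictor(history)   # identical prediction as the compressor
    return predicted + delta         # exact, lossless reconstruction
```

Because good predictions concentrate the deltas around zero, they are far cheaper to encode than the raw readings.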
@@ -42,6 +42,20 @@ If we were to give up on lossless compression, one could expand MiddleOut to for
Based on an expected distribution of deltas that have to be transmitted, an efficient Huffman-like binary format is used for encoding the data.
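A sketch of what such an encoder can look like, using a plain Huffman code built from an assumed delta distribution (`build_huffman` is illustrative; the actual bitstream format in this repo may differ):

```python
import heapq
from collections import Counter

def build_huffman(freqs):
    """freqs: dict mapping symbol -> count. Returns symbol -> bitstring."""
    heap = [(count, i, {sym: ''}) for i, (sym, count) in enumerate(freqs.items())]
    heapq.heapify(heap)
    tiebreak = len(heap)
    while len(heap) > 1:
        c1, _, codes1 = heapq.heappop(heap)
        c2, _, codes2 = heapq.heappop(heap)
        merged = {s: '0' + b for s, b in codes1.items()}
        merged.update({s: '1' + b for s, b in codes2.items()})
        heapq.heappush(heap, (c1 + c2, tiebreak, merged))
        tiebreak += 1
    return heap[0][2]

deltas = [0, 0, 1, -1, 0, 2, 0, -1, 0, 0]   # toy prediction errors
code = build_huffman(Counter(deltas))        # frequent deltas get short codes
bitstream = ''.join(code[d] for d in deltas)
```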
## On Lossless 200x Compression
Expecting a 200x compression ratio is ludicrous, as it would mean transmitting only 1 bit per 20 data points. Given the high entropy of the readings, this is an absurd goal. Anyone who thinks lossless 200x compression is remotely feasible has a woefully inadequate grasp of information theory. Please, do yourself a favor and read Shannon’s paper.
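For concreteness, the arithmetic behind that claim (sample width and target ratio as stated in the challenge):

```python
bits_per_sample = 10   # native resolution of the readings
target_ratio = 200     # demanded compression factor
budget = bits_per_sample / target_ratio
print(budget)          # 0.05 bits per sample, i.e. 1 bit per 20 samples
```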
Furthermore, there's no need for lossless compression. These readings feed into an ML model to extract intent, and any such encoder inherently reduces information content with each layer ('intelligence is the ability to disregard irrelevant information'). Instead, compression should be regarded as an integral part of the ML pipeline for intent extraction. It should be allowed to be lossy, with the key being to define the loss metric not by information loss in the input space, but rather in the latent space of the pipeline.
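A minimal sketch of that idea, assuming some callable `intent_encoder` standing in for the first stages of the intent-extraction pipeline (a hypothetical name, not an existing component here):

```python
import numpy as np

def latent_loss(intent_encoder, x, x_reconstructed):
    z = intent_encoder(x)                     # latent of the original signal
    z_hat = intent_encoder(x_reconstructed)   # latent after the lossy round-trip
    return float(np.mean((z - z_hat) ** 2))   # distortion that actually matters
```

Under this metric, a reconstruction may deviate arbitrarily in input space as long as the downstream intent estimate is unchanged.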
Let's see how far we can get with the approach presented here...
On another note: why is the provided dataset 16-bit when the readings are supposedly 10-bit? And the last 6 bits are not all zeros. We know they can't encode sensible information when the readings are only 10-bit, but we also can't just throw them away, since they do contain something. We also observe that all possible values the data points take on are separated by 64 or 63 (a spacing of 64 would make sense; 63 very much does not). (See `fucked_up_wavs.py`.)
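The spacing claim can be checked with a couple of lines (a hypothetical helper, separate from `fucked_up_wavs.py`):

```python
import numpy as np

def value_spacings(data: np.ndarray) -> np.ndarray:
    """Gaps between the distinct values occurring in the data."""
    return np.unique(np.diff(np.unique(data)))   # observed: array([63, 64])
```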
## On Evaluation
The provided eval.sh script is also flawed (as in: not aligned with what should be optimized for), since it (a) counts the size of the compressor and decompressor as part of the transmitted data, which makes no sense for the decompressor in particular, and (b) makes it impossible to compress data from multiple threads together, which is required for the free lunch we can get from topological reconstruction.
## TODO
- Our flagship bitstream encoder builds an optimal Huffman tree assuming the deltas are binomially distributed. This should be updated once we have a more precise approximation of the delta distribution.
@@ -72,3 +86,12 @@ To train the model, run:
```bash
python main.py <config_file.yaml> <exp_name>
```
## Icon Attribution
The icon used in this repository is a combination of the Pied Piper logo from the HBO show _Silicon Valley_ and the Neuralink logo. I do not hold any trademarks on either logo; they are owned by their respective entities.
## License
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License (CC BY-NC-SA 4.0). For commercial use, please contact me at [mail@dominik-roth.eu](mailto:mail@dominik-roth.eu).
You can view the full text of the license [here](LICENSE).
fucked_up_wavs.py (new file, 52 lines)
@@ -0,0 +1,52 @@
```python
import wave
import numpy as np
import matplotlib.pyplot as plt


def load_wav(file_path):
    with wave.open(file_path, 'rb') as wav_file:
        sample_rate = wav_file.getframerate()
        num_frames = wav_file.getnframes()
        num_channels = wav_file.getnchannels()
        sampwidth = wav_file.getsampwidth()
        raw_data = wav_file.readframes(num_frames)

    return sample_rate, num_channels, sampwidth, raw_data


def inspect_wav(file_path):
    sample_rate, num_channels, sampwidth, raw_data = load_wav(file_path)

    # 8-bit WAV samples are unsigned; wider sample widths are signed.
    fmt = {1: np.uint8, 2: np.int16, 4: np.int32}.get(sampwidth)
    if fmt is None:
        raise ValueError(f"Unsupported sample width: {sampwidth} bytes")

    data = np.frombuffer(raw_data, dtype=fmt)

    print(f"Sample Rate: {sample_rate}")
    print(f"Channels: {num_channels}")
    print(f"Sample Width: {sampwidth} bytes")

    # Calculate and print max/min values and required bits
    max_value = np.max(data)
    min_value = np.min(data)
    max_bits = np.ceil(np.log2(max_value + 1))
    min_bits = np.ceil(np.log2(abs(min_value) + 1))

    # Ensure to include the sign bit
    bits_required = max(max_bits, min_bits) + 1

    print(f"Maximum Value: {max_value}")
    print(f"Minimum Value: {min_value}")
    print(f"Bits Required to Represent Maximum Value: {max_bits}")
    print(f"Bits Required to Represent Minimum Value: {min_bits}")
    print(f"Total Bits Required (including sign bit): {bits_required}")


file_path = 'data/d657634f-4d93-410c-8a95-52e2da100a72.wav'
inspect_wav(file_path)


# Sample Rate: 19531
# Channels: 1
# Sample Width: 2 bytes
# Maximum Value: 18929
# Minimum Value: -9513
# Bits Required to Represent Maximum Value: 15.0
# Bits Required to Represent Minimum Value: 14.0
# Total Bits Required (including sign bit): 16.0
```