maximuspowers committed
Commit 49a5ff9 · verified · 1 Parent(s): 55ceee2

Upload weight-space autoencoder (encoder + decoder) and configuration

Files changed (5):
  1. README.md +42 -0
  2. config.yaml +123 -0
  3. decoder.pt +3 -0
  4. encoder.pt +3 -0
  5. tokenizer_config.json +8 -0
README.md ADDED
@@ -0,0 +1,42 @@
+ ---
+ tags:
+ - weight-space-learning
+ - neural-network-autoencoder
+ - autoencoder
+ - transformer
+ datasets:
+ - maximuspowers/muat-fourier-5
+ ---
+
+ # Weight-Space Autoencoder (Transformer)
+
+ This model is a weight-space autoencoder trained on neural-network weight signatures.
+ It includes both an encoder, which compresses weights into latent representations, and a decoder, which reconstructs weights from latent codes.
+
+ ## Model Description
+
+ - **Architecture**: Transformer encoder-decoder
+ - **Training Dataset**: maximuspowers/muat-fourier-5
+ - **Input Mode**: signature
+ - **Latent Dimension**: 256
+
+ ## Tokenization
+
+ - **Chunk Size**: 1 weight value per token
+ - **Max Tokens**: 512
+ - **Includes Metadata**: true
+
+ ## Training Config
+
+ - **Loss Function**: contrastive (SimCLR-style, with MSE reconstruction)
+ - **Optimizer**: adam
+ - **Learning Rate**: 0.0001
+ - **Batch Size**: 8
+
+ ## Performance Metrics (Test Set)
+
+ - **MSE**: 0.298185
+ - **MAE**: 0.404015
+ - **RMSE**: 0.546063
+ - **Cosine Similarity**: 0.6089
+ - **R² Score**: 0.2872
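
As a quick sanity check on the metrics table above: RMSE should be the square root of MSE when both are computed over the same test set, and under the usual definition R² = 1 − MSE / Var(y), so a target variance can be inferred. The implied variance below is an inference from the reported numbers, not a figure from the model card. A minimal stdlib sketch:

```python
import math

# Reported test-set metrics from the model card above.
mse = 0.298185
rmse = 0.546063
r2 = 0.2872

# RMSE should equal sqrt(MSE) on the same test set.
assert math.isclose(math.sqrt(mse), rmse, rel_tol=1e-5)

# Under R^2 = 1 - MSE / Var(y), the implied target variance is:
implied_var = mse / (1.0 - r2)
print(f"implied target variance ≈ {implied_var:.4f}")  # ≈ 0.4183
```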
config.yaml ADDED
@@ -0,0 +1,123 @@
+ architecture:
+   latent_dim: 256
+   mlp:
+     decoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 256
+       - 384
+       - 512
+     encoder:
+       activation: relu
+       batch_norm: true
+       dropout: 0.2
+       hidden_dims:
+       - 512
+       - 384
+       - 256
+     token_pooling: mean
+   transformer:
+     decoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     encoder:
+       activation: relu
+       d_model: 512
+       dim_feedforward: 2048
+       dropout: 0.1
+       num_heads: 8
+       num_layers: 6
+     pooling: mean
+     positional_encoding: learned
+   type: transformer
+ dataloader:
+   num_workers: 0
+   pin_memory: true
+ dataset:
+   hf_dataset: maximuspowers/muat-fourier-5
+   input_mode: signature
+   max_dimensions:
+     max_hidden_layers: 6
+     max_neurons_per_layer: 8
+     max_sequence_length: 5
+   neuron_profile:
+     methods:
+     - fourier
+   random_seed: 42
+   test_split: 0.1
+   train_split: 0.8
+   val_split: 0.1
+ device:
+   type: auto
+ evaluation:
+   metrics:
+   - mse
+   - mae
+   - rmse
+   - cosine_similarity
+   - relative_error
+   - r2_score
+   per_layer_metrics: false
+ hub:
+   enabled: true
+   private: false
+   push_logs: true
+   push_metrics: true
+   push_model: true
+   repo_id: maximuspowers/sig-autoencoder-fourier-5-simclr-mse-new
+   token: <REDACTED>
+ logging:
+   checkpoint:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     save_best_only: true
+   tensorboard:
+     auto_launch: true
+     enabled: true
+     log_interval: 10
+     port: 6006
+   verbose: true
+ loss:
+   augmentation_type: noise
+   contrast_type: simclr
+   dropout_prob: 0.1
+   gamma: 0.4
+   noise_std: 0.01
+   projection_head:
+     hidden_dim: 256
+     input_dim: 256
+     output_dim: 128
+   reconstruction_type: mse
+   temperature: 0.1
+   type: contrastive
+ run_dir: /Users/max/Desktop/muat/model_zoo/runs/train-encoder-decoder_config_2025-12-14_00-00-05
+ run_log_cleanup: false
+ tokenization:
+   chunk_size: 1
+   granularity: neuron
+   include_metadata: true
+   max_tokens: 512
+ training:
+   batch_size: 8
+   early_stopping:
+     enabled: true
+     mode: min
+     monitor: val_loss
+     patience: 5
+   epochs: 5
+   learning_rate: 0.0001
+   lr_scheduler:
+     enabled: true
+     factor: 0.5
+     min_lr: 1.0e-06
+     patience: 3
+     type: reduce_on_plateau
+   optimizer: adam
+   weight_decay: 0.0001
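
The `dataset` section above specifies a deterministic 0.8/0.1/0.1 train/val/test split with `random_seed: 42`. The ratios and seed come from config.yaml; the shuffle-then-slice strategy below is an assumption for illustration, not the repo's actual splitting code:

```python
import random

def split_indices(n, train=0.8, val=0.1, seed=42):
    """Deterministic train/val/test index split.

    Ratios and seed follow the dataset section of config.yaml;
    the shuffle-then-slice logic itself is a hypothetical sketch.
    """
    idx = list(range(n))
    random.Random(seed).shuffle(idx)  # seeded, so runs are reproducible
    n_train = int(n * train)
    n_val = int(n * val)
    return (idx[:n_train],
            idx[n_train:n_train + n_val],
            idx[n_train + n_val:])

train, val, test = split_indices(1000)
print(len(train), len(val), len(test))  # 800 100 100
```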
decoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:a13ca74c3d1bffd112b0fc0156a7837c832fa3ad7514a0f4f72fa07cc6d053e2
+ size 102545486
encoder.pt ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:7d46982df3162def7569a6cf378470c8a9513c0b7c64934800135c11b3f85ae6
+ size 77277228
tokenizer_config.json ADDED
@@ -0,0 +1,8 @@
+ {
+   "chunk_size": 1,
+   "max_tokens": 512,
+   "include_metadata": true,
+   "metadata_features": 5,
+   "token_dim": 11,
+   "granularity": "neuron"
+ }
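
The tokenizer config implies a simple scheme: each token holds `chunk_size` consecutive weight values, and the sequence is truncated to `max_tokens`. A minimal sketch under that reading; how the real tokenizer builds its 5 metadata features per token (and reaches `token_dim: 11`) is not shown here:

```python
# Values from tokenizer_config.json above.
cfg = {"chunk_size": 1, "max_tokens": 512, "include_metadata": True}

def tokenize_weights(weights, cfg):
    """Chunk a flat weight vector into fixed-size tokens.

    Hypothetical sketch of what the config implies: chunk, then
    truncate to max_tokens. Metadata features are omitted.
    """
    size = cfg["chunk_size"]
    tokens = [weights[i:i + size] for i in range(0, len(weights), size)]
    return tokens[:cfg["max_tokens"]]

tokens = tokenize_weights([0.1, -0.2, 0.3], cfg)
print(tokens)  # [[0.1], [-0.2], [0.3]]
```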