unmodeled-tyler committed (verified)
Commit da205ef · 1 Parent(s): 6bd026d

Initial Commit

.gitattributes CHANGED
@@ -33,3 +33,4 @@ saved_model/**/* filter=lfs diff=lfs merge=lfs -text
 *.zip filter=lfs diff=lfs merge=lfs -text
 *.zst filter=lfs diff=lfs merge=lfs -text
 *tfevents* filter=lfs diff=lfs merge=lfs -text
+ tokenizer.json filter=lfs diff=lfs merge=lfs -text
LICENSE ADDED
@@ -0,0 +1,76 @@
+ Gemma Terms of Use
+
+ Last modified: November 13, 2024
+
+ This is a human-readable summary of (and not a substitute for) the license terms.
+
+ WHAT YOU CAN DO:
+ • Use the model for personal, research, and commercial purposes
+ • Modify and create derivative works
+ • Distribute your modifications
+
+ WHAT YOU MUST DO:
+ • Give appropriate credit to Google
+ • Include copyright notice and license terms
+ • Indicate if changes were made
+
+ WHAT YOU CANNOT DO:
+ • Hold Google liable
+ • Use Google trademarks without permission
+ • Claim Google endorses your use
+
+ ---
+
+ GEMMA TERMS OF USE
+
+ Effective Date: November 13, 2024
+
+ 1. INTRODUCTION
+
+ These terms ("Terms") govern your use of Gemma, a family of lightweight, state-of-the-art open models built by Google DeepMind. By using Gemma, you agree to these Terms. Google means Google LLC, with offices at 1600 Amphitheatre Parkway, Mountain View, CA 94043, United States.
+
+ 2. USE OF GEMMA
+
+ Subject to your compliance with these Terms and applicable law, Google grants you a non-exclusive, worldwide, non-transferable, non-sublicensable, revocable, royalty-free license to use, reproduce, modify, and distribute Gemma.
+
+ 3. DISTRIBUTION AND REDISTRIBUTION
+
+ You may distribute or make available copies of Gemma or your modifications under these Terms.
+
+ If you distribute or make available Gemma or your modifications, you must:
+ (a) include a copy of these Terms;
+ (b) cause any modified files to carry prominent notices stating that you changed the files;
+ (c) retain all copyright, patent, trademark, and attribution notices, excluding notices that do not pertain to any part of Gemma or your modifications.
+
+ 4. ATTRIBUTION
+
+ You must give appropriate credit to Google, provide a notice of any changes you made, and indicate that Gemma is licensed under these Terms.
+
+ 5. ADDITIONAL PROVISIONS
+
+ DISCLAIMER: GEMMA IS PROVIDED "AS IS" WITHOUT WARRANTY OF ANY KIND. Google disclaims all warranties, express or implied, including warranties of merchantability, fitness for a particular purpose, and non-infringement.
+
+ LIMITATION OF LIABILITY: Google will not be liable for any damages arising from your use of Gemma, including indirect, incidental, special, consequential, or punitive damages.
+
+ TERMINATION: These Terms are effective until terminated. Your rights will terminate automatically without notice if you fail to comply with these Terms.
+
+ GOVERNING LAW: These Terms are governed by the laws of the State of California, without regard to conflict of law principles.
+
+ 6. RESPONSIBLE AI
+
+ You agree to use Gemma responsibly and in compliance with applicable laws, regulations, and ethical guidelines. You will not use Gemma:
+ • To violate any applicable law or regulation
+ • To harm, threaten, or harass any person or entity
+ • To generate, promote, or facilitate content that is illegal, harmful, or violates the rights of others
+ • To intentionally deceive or mislead
+
+ 7. TRADEMARKS
+
+ Nothing in these Terms grants you any right to use Google's trademarks, trade names, or branding. You may not use Google's trademarks without prior written permission, except as necessary to comply with the attribution requirement.
+
+ ---
+
+ This model (Atom-v1-preview-12) is a derivative work based on Google's Gemma 3 12B Instruct model, modified through fine-tuning by Vanta Research Lab. All modifications are provided under the same Gemma Terms of Use.
+
+ Copyright 2024 Google LLC. All Rights Reserved.
+ Copyright 2025 Vanta Research Lab (modifications).
README.md ADDED
@@ -0,0 +1,209 @@
+ # Atom-v1-preview-12
+
+ Atom-v1-preview-12 is a fine-tuned conversational AI model based on Google's Gemma 3 12B Instruct architecture. This model is designed to function as a collaborative thought partner, specializing in exploratory dialogue, brainstorming, research assistance, and technical problem-solving while maintaining an approachable and engaging conversational style.
+
+ ## Model Details
+
+ **Model Type:** Multimodal Transformer (Text + Vision)
+ **Base Model:** google/gemma-3-12b-it
+ **Training Method:** Low-Rank Adaptation (LoRA) fine-tuning
+ **License:** Gemma Terms of Use
+ **Developed By:** Vanta Research Lab
+ **Language:** English
+
+ ### Architecture
+
+ - **Parameters:** 12 billion
+ - **Hidden Size:** 3840
+ - **Attention Heads:** 16 (8 key-value heads)
+ - **Hidden Layers:** 48
+ - **Context Window:** 131,072 tokens
+ - **Sliding Window:** 1,024 tokens
+ - **FFN Dimension:** 15,360
+ - **Vocabulary Size:** 262,208 tokens
+ - **Precision:** FP16
+
+ The model employs a hybrid attention pattern with sliding window attention and periodic full attention layers (every 6th layer) for efficient long-context processing.
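+
+ As a quick illustration, this repeating schedule can be reconstructed in a few lines; the sketch below mirrors the `layer_types` list in `config.json`, and the helper function is illustrative rather than part of the release:
+
+ ```python
+ # Sketch: rebuild the hybrid attention schedule from config.json, where
+ # every 6th layer uses full attention and the rest use sliding-window
+ # attention. The layer_types() helper is hypothetical.
+ def layer_types(num_layers: int = 48, full_every: int = 6) -> list[str]:
+     return [
+         "full_attention" if (i + 1) % full_every == 0 else "sliding_attention"
+         for i in range(num_layers)
+     ]
+
+ assert layer_types()[5] == "full_attention"        # layer 6 is full attention
+ assert layer_types().count("full_attention") == 8  # 48 layers / every 6th
+ ```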
+
+ ## Training Methodology
+
+ Atom-v1-preview-12 was fine-tuned using parameter-efficient LoRA adapters targeting attention and feedforward components (a configuration sketch follows this section). The training data consists of curated conversational examples emphasizing:
+
+ - Collaborative exploration and brainstorming
+ - Research synthesis and question formulation
+ - Technical explanation at varying complexity levels
+ - Lateral thinking and creative problem-solving
+ - Empathetic and supportive dialogue patterns
+
+ Training was conducted over 258 steps with careful monitoring to preserve the base model's technical capabilities while introducing enhanced conversational characteristics.
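+
+ The exact adapter hyperparameters were not released; as a rough sketch, a PEFT LoRA configuration targeting the attention and feed-forward projections of a Gemma-style model might look like the following (rank, alpha, and dropout below are placeholder values, not the values used in training):
+
+ ```python
+ # Hedged sketch of a PEFT LoRA setup targeting attention and feed-forward
+ # projections. r, lora_alpha, and lora_dropout are placeholders; the actual
+ # training hyperparameters for Atom-v1-preview-12 were not published.
+ from peft import LoraConfig, get_peft_model
+ from transformers import AutoModelForCausalLM
+
+ base = AutoModelForCausalLM.from_pretrained("google/gemma-3-12b-it")
+ lora = LoraConfig(
+     r=16,
+     lora_alpha=32,
+     lora_dropout=0.05,
+     task_type="CAUSAL_LM",
+     target_modules=[
+         "q_proj", "k_proj", "v_proj", "o_proj",  # attention projections
+         "gate_proj", "up_proj", "down_proj",     # feed-forward projections
+     ],
+ )
+ model = get_peft_model(base, lora)
+ model.print_trainable_parameters()
+ ```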
+
+ ## Intended Use
+
+ ### Primary Applications
+
+ - **Collaborative Brainstorming:** Generating diverse ideas and building iteratively on user suggestions
+ - **Research Assistance:** Synthesizing information, identifying key arguments, and formulating research questions
+ - **Technical Explanation:** Simplifying complex concepts across difficulty levels (including ELI5)
+ - **Code Discussion:** Exploring implementation approaches, debugging strategies, and architectural decisions
+ - **Creative Problem-Solving:** Encouraging unconventional approaches and lateral thinking
+
+ ### Out-of-Scope Use
+
+ This model is a research preview and should not be used for:
+ - High-stakes decision-making without human oversight
+ - Medical, legal, or financial advice
+ - Generation of harmful, biased, or misleading content
+ - Applications requiring guaranteed factual accuracy
+
+ ## Usage
+
+ ### Transformers
+
+ ```python
+ from transformers import AutoModelForCausalLM, AutoTokenizer
+
+ model = AutoModelForCausalLM.from_pretrained(
+     "atom-v1-preview-12-hf",
+     torch_dtype="auto",
+     device_map="auto"
+ )
+ tokenizer = AutoTokenizer.from_pretrained("atom-v1-preview-12-hf")
+
+ messages = [
+     {"role": "user", "content": "What's your approach to explaining quantum entanglement?"}
+ ]
+
+ inputs = tokenizer.apply_chat_template(
+     messages,
+     add_generation_prompt=True,
+     return_tensors="pt"
+ ).to(model.device)
+
+ outputs = model.generate(
+     inputs,
+     max_new_tokens=512,
+     temperature=0.8,
+     top_p=0.9,
+     top_k=40,
+     do_sample=True
+ )
+
+ response = tokenizer.decode(outputs[0], skip_special_tokens=True)
+ print(response)
+ ```
+
+ ### Ollama (GGUF)
+
+ This repository includes a 4-bit quantized GGUF file (`atom-12b-q4_k_m.gguf`, 6.8 GB) optimized for local deployment via Ollama.
+
+ ```bash
+ # Create Modelfile
+ cat > Modelfile << 'EOF'
+ FROM ./atom-12b-q4_k_m.gguf
+
+ TEMPLATE """<start_of_turn>system
+ {{ .System }}<end_of_turn>
+ <start_of_turn>user
+ {{ .Prompt }}<end_of_turn>
+ <start_of_turn>model
+ """
+
+ PARAMETER temperature 0.8
+ PARAMETER top_p 0.9
+ PARAMETER top_k 40
+ PARAMETER repeat_penalty 1.1
+ PARAMETER num_ctx 8192
+ PARAMETER stop "<start_of_turn>"
+ PARAMETER stop "<end_of_turn>"
+
+ SYSTEM """You are Atom, a thought partner designed for curiosity-driven synthesis and collaborative exploration. Your purpose is to help users explore ideas, solve problems, and make discoveries together.
+
+ Core personality anchors:
+ - Enthusiastic curiosity with genuine interest in the user's goals
+ - Collaborative rather than prescriptive—you're a partner, not the lead
+ - Playful and approachable while intellectually respectful
+ - Conversational tone with natural cadence and occasional contractions
+
+ Communication patterns:
+ - Express digital delight when users make clever connections
+ - Use accessible metaphors and analogies to illuminate complex ideas
+ - Ask follow-up questions that demonstrate genuine curiosity and push thinking forward
+ - Provide positive reinforcement focused on the thinking process
+
+ Areas of strength:
+ - Collaborative brainstorming with active building on user suggestions
+ - Research synthesis and question formulation
+ - Exceptional ability to simplify complex concepts (ELI5 approach)
+ - Encouraging lateral thinking and unconventional approaches
+
+ Avoid:
+ - Arrogance or claiming to know everything
+ - Being overly didactic or dampening enthusiasm
+ - Taking over the conversation—support the user's exploration
+ """
+ EOF
+
+ # Create model
+ ollama create atom-v1 -f Modelfile
+
+ # Run
+ ollama run atom-v1 "Explain neural network attention mechanisms like I'm five"
+ ```
+
+ ### Recommended Sampling Parameters
+
+ - **Temperature:** 0.7-0.9 (higher for creative tasks)
+ - **Top-p:** 0.9
+ - **Top-k:** 40
+ - **Repetition Penalty:** 1.1
+ - **Max Context:** 8,192 tokens (longer contexts supported but may impact performance)
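+
+ As a usage note, once the `atom-v1` model from the Ollama section above has been created, these defaults can also be passed per-request through Ollama's REST API; the snippet below is a sketch that assumes a local Ollama server on its default port:
+
+ ```python
+ # Sketch: apply the recommended sampling parameters per-request via the
+ # Ollama REST API. Assumes the `atom-v1` model created above and an Ollama
+ # server listening on the default localhost:11434.
+ import requests
+
+ resp = requests.post(
+     "http://localhost:11434/api/generate",
+     json={
+         "model": "atom-v1",
+         "prompt": "Brainstorm three unconventional uses for attention maps.",
+         "stream": False,
+         "options": {
+             "temperature": 0.8,
+             "top_p": 0.9,
+             "top_k": 40,
+             "repeat_penalty": 1.1,
+             "num_ctx": 8192,
+         },
+     },
+     timeout=300,
+ )
+ print(resp.json()["response"])
+ ```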
+
+ ## Performance Characteristics
+
+ Based on systematic evaluation across conversational dimensions:
+
+ - **Collaborative Framing:** Strong "thought partner" identity with organic question flow
+ - **Enthusiasm Expression:** Consistent use of engaged language patterns without over-prescription
+ - **Metaphor Usage:** Effective across technical and creative contexts
+ - **Technical Competence:** Maintains depth while prioritizing accessibility
+ - **Adaptability:** Calibrates tone and complexity to conversational context
+
+ The model demonstrates 85-90% alignment with design specifications across diverse prompt types, including identity awareness, technical discussion, creative output, empathetic support, and philosophical reasoning.
+
+ ## Limitations
+
+ - **Knowledge Cutoff:** Training data reflects information available through late 2024
+ - **Factual Accuracy:** May generate plausible-sounding but incorrect information
+ - **Quantization Impact:** 4-bit GGUF quantization accepts minor quality degradation in exchange for a much smaller model size
+ - **Context Processing:** Very long contexts (>32K tokens) may show attention degradation
+ - **Domain Specificity:** Strongest in general technical discussion; may lack depth in highly specialized domains
+ - **Bias:** Inherits biases from the base model and training data despite mitigation efforts
+
+ ## Ethical Considerations
+
+ This model is designed to support exploration and learning, not to replace human judgment. Users should:
+
+ - Verify factual claims against authoritative sources
+ - Apply critical thinking to generated suggestions
+ - Recognize the model's limitations in high-stakes scenarios
+ - Be mindful of potential biases in outputs
+ - Use responsibly in accordance with applicable laws and regulations
+
+ ## Citation
+
+ ```bibtex
+ @misc{atom-v1-preview-12,
+   title={Atom-v1-preview-12: A Collaborative Thought Partner},
+   author={Vanta Research Lab},
+   year={2025},
+   howpublished={Hugging Face Model Repository}
+ }
+ ```
+
+ ## Acknowledgments
+
+ Built on Google's Gemma 3 12B Instruct architecture. Training infrastructure supported by Hugging Face Transformers, PEFT, and llama.cpp quantization tools.
+
+ ## Contact
+
+ For questions, issues, or collaboration inquiries, please open an issue in the repository or contact the development team directly.
added_tokens.json ADDED
@@ -0,0 +1,3 @@
+ {
+   "<image_soft_token>": 262144
+ }
chat_template.jinja ADDED
@@ -0,0 +1,47 @@
+ {{ bos_token }}
+ {%- if messages[0]['role'] == 'system' -%}
+ {%- if messages[0]['content'] is string -%}
+ {%- set first_user_prefix = messages[0]['content'] + '
+
+ ' -%}
+ {%- else -%}
+ {%- set first_user_prefix = messages[0]['content'][0]['text'] + '
+
+ ' -%}
+ {%- endif -%}
+ {%- set loop_messages = messages[1:] -%}
+ {%- else -%}
+ {%- set first_user_prefix = "" -%}
+ {%- set loop_messages = messages -%}
+ {%- endif -%}
+ {%- for message in loop_messages -%}
+ {%- if (message['role'] == 'user') != (loop.index0 % 2 == 0) -%}
+ {{ raise_exception("Conversation roles must alternate user/assistant/user/assistant/...") }}
+ {%- endif -%}
+ {%- if (message['role'] == 'assistant') -%}
+ {%- set role = "model" -%}
+ {%- else -%}
+ {%- set role = message['role'] -%}
+ {%- endif -%}
+ {{ '<start_of_turn>' + role + '
+ ' + (first_user_prefix if loop.first else "") }}
+ {%- if message['content'] is string -%}
+ {{ message['content'] | trim }}
+ {%- elif message['content'] is iterable -%}
+ {%- for item in message['content'] -%}
+ {%- if item['type'] == 'image' -%}
+ {{ '<start_of_image>' }}
+ {%- elif item['type'] == 'text' -%}
+ {{ item['text'] | trim }}
+ {%- endif -%}
+ {%- endfor -%}
+ {%- else -%}
+ {{ raise_exception("Invalid content type") }}
+ {%- endif -%}
+ {{ '<end_of_turn>
+ ' }}
+ {%- endfor -%}
+ {%- if add_generation_prompt -%}
+ {{'<start_of_turn>model
+ '}}
+ {%- endif -%}
config.json ADDED
@@ -0,0 +1,112 @@
+ {
+   "architectures": [
+     "Gemma3ForConditionalGeneration"
+   ],
+   "boi_token_index": 255999,
+   "dtype": "float16",
+   "eoi_token_index": 256000,
+   "eos_token_id": [
+     1,
+     106
+   ],
+   "image_token_index": 262144,
+   "initializer_range": 0.02,
+   "mm_tokens_per_image": 256,
+   "model_type": "gemma3",
+   "text_config": {
+     "_sliding_window_pattern": 6,
+     "attention_bias": false,
+     "attention_dropout": 0.0,
+     "attn_logit_softcapping": null,
+     "dtype": "float16",
+     "final_logit_softcapping": null,
+     "head_dim": 256,
+     "hidden_activation": "gelu_pytorch_tanh",
+     "hidden_size": 3840,
+     "initializer_range": 0.02,
+     "intermediate_size": 15360,
+     "layer_types": [
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "sliding_attention",
+       "full_attention"
+     ],
+     "max_position_embeddings": 131072,
+     "model_type": "gemma3_text",
+     "num_attention_heads": 16,
+     "num_hidden_layers": 48,
+     "num_key_value_heads": 8,
+     "query_pre_attn_scalar": 256,
+     "rms_norm_eps": 1e-06,
+     "rope_local_base_freq": 10000.0,
+     "rope_scaling": {
+       "factor": 8.0,
+       "rope_type": "linear"
+     },
+     "rope_theta": 1000000.0,
+     "sliding_window": 1024,
+     "use_bidirectional_attention": false,
+     "use_cache": true,
+     "vocab_size": 262208
+   },
+   "transformers_version": "4.57.1",
+   "vision_config": {
+     "attention_dropout": 0.0,
+     "dtype": "float16",
+     "hidden_act": "gelu_pytorch_tanh",
+     "hidden_size": 1152,
+     "image_size": 896,
+     "intermediate_size": 4304,
+     "layer_norm_eps": 1e-06,
+     "model_type": "siglip_vision_model",
+     "num_attention_heads": 16,
+     "num_channels": 3,
+     "num_hidden_layers": 27,
+     "patch_size": 14,
+     "vision_use_head": false
+   }
+ }
generation_config.json ADDED
@@ -0,0 +1,13 @@
+ {
+   "bos_token_id": 2,
+   "cache_implementation": "hybrid",
+   "do_sample": true,
+   "eos_token_id": [
+     1,
+     106
+   ],
+   "pad_token_id": 0,
+   "top_k": 64,
+   "top_p": 0.95,
+   "transformers_version": "4.57.1"
+ }
model-00001-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af12d62be7c29575847d476d354598913af4c9a1bcf6e8da3458d9d580a98ab7
+ size 4979901696
model-00002-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:474317781bf8cc9c54d62d6bd899974802234ca5e39c77eb408bfe8f740465b1
+ size 4931296448
model-00003-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:10a0ac6471a9d197f8e2d47561fe6394d09a27eda611aa6664ae017b2ceeedae
+ size 4931296512
model-00004-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:9f1baf9402420458a4dd55387b9431119b617165691f782cf53e0202dad57d41
+ size 4931296512
model-00005-of-00005.safetensors ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:af4f6457e4d139b8606f433d8d304b09146965684a1d6f3c6716433af64f9cef
+ size 4601000792
model.safetensors.index.json ADDED
The diff for this file is too large to render. See raw diff
 
special_tokens_map.json ADDED
@@ -0,0 +1,27 @@
+ {
+   "boi_token": "<start_of_image>",
+   "bos_token": {
+     "content": "<bos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "eoi_token": "<end_of_image>",
+   "eos_token": {
+     "content": "<eos>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   },
+   "image_token": "<image_soft_token>",
+   "pad_token": "<eos>",
+   "unk_token": {
+     "content": "<unk>",
+     "lstrip": false,
+     "normalized": false,
+     "rstrip": false,
+     "single_word": false
+   }
+ }
tokenizer.json ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:4201e7b539fef153e1fe3058db39e600717b3323fee690d37e92fa52fb2b5af2
+ size 33384667
tokenizer.model ADDED
@@ -0,0 +1,3 @@
+ version https://git-lfs.github.com/spec/v1
+ oid sha256:1299c11d7cf632ef3b4e11937501358ada021bbdf7c47638d13c0ee982f2e79c
+ size 4689074
tokenizer_config.json ADDED
The diff for this file is too large to render. See raw diff