Update README.md
print(outputs[0]["generated_text"])
```
## Training Details
The model was fine-tuned using Unsloth on a Tesla T4 GPU (Google Colab); a configuration sketch follows the hyperparameter list below.

### Hyperparameters
1) Rank (r): 16
2) LoRA Alpha: 16
3) Target Modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
4) Quantization: 4-bit (NF4, NormalFloat)
5) Max Sequence Length: 2048
6) Learning Rate: 2e-4
7) Optimizer: adamw_8bit
8) Max Steps: 60
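For reproducibility, here is a minimal sketch of how these settings map onto the standard Unsloth + TRL workflow. The base checkpoint name, batch size, and the simple field concatenation are illustrative assumptions; the actual training script is not included in this card.

```python
# Illustrative sketch only: wires the hyperparameters above into the
# standard Unsloth + TRL (classic SFTTrainer interface) workflow.
from unsloth import FastLanguageModel
from datasets import load_dataset
from transformers import TrainingArguments
from trl import SFTTrainer

# Load the base model in 4-bit (NF4) at the max sequence length above.
model, tokenizer = FastLanguageModel.from_pretrained(
    model_name="unsloth/Llama-3.2-3B-Instruct",  # assumed base checkpoint
    max_seq_length=2048,
    load_in_4bit=True,
)

# Attach LoRA adapters with the rank, alpha, and target modules above.
model = FastLanguageModel.get_peft_model(
    model,
    r=16,
    lora_alpha=16,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj",
                    "gate_proj", "up_proj", "down_proj"],
)

# Naive field concatenation as a placeholder; see "Dataset Info" below
# for a fuller formatting sketch.
dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")
dataset = dataset.map(lambda row: {"text": "\n\n".join(
    row[k] for k in ("sql_context", "sql_prompt", "sql", "sql_explanation"))})

trainer = SFTTrainer(
    model=model,
    tokenizer=tokenizer,
    train_dataset=dataset,
    dataset_text_field="text",
    max_seq_length=2048,
    args=TrainingArguments(
        learning_rate=2e-4,
        optim="adamw_8bit",
        max_steps=60,
        per_device_train_batch_size=2,  # assumed; not stated above
        output_dir="outputs",
    ),
)
trainer.train()
```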
## Dataset Info
The model was trained on the gretelai/synthetic_text_to_sql dataset, utilizing the following fields (a formatting sketch follows the list):
1) sql_context: Used as the database schema context.
2) sql_prompt: The natural language question.
3) sql: The target SQL query.
4) sql_explanation: The explanation of the query logic.
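The exact prompt template used for fine-tuning is not published in this card. As a sketch, the four fields can be folded into a single supervised training string along these lines (the template wording is hypothetical):

```python
# Hypothetical template: the field roles come from this card, but the
# exact wording/format used during training is an assumption.
from datasets import load_dataset

TEMPLATE = """### Database schema:
{sql_context}

### Question:
{sql_prompt}

### SQL:
{sql}

### Explanation:
{sql_explanation}"""

def format_example(row):
    # Fold the four dataset fields into one training string under "text".
    return {"text": TEMPLATE.format(
        sql_context=row["sql_context"],
        sql_prompt=row["sql_prompt"],
        sql=row["sql"],
        sql_explanation=row["sql_explanation"],
    )}

dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")
dataset = dataset.map(format_example)
```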
## Limitations
1) Training Steps: This model was trained for a limited number of steps (60) as a proof of concept. It may not generalize well to extremely complex or unseen database schemas.
2) Hallucination: Like all LLMs, it may generate syntactically correct but logically incorrect SQL. Always validate the output before running it on a production database (a minimal validation sketch follows this list).
3) Scope: It is optimized for standard SQL (similar to SQLite/PostgreSQL) as presented in the GretelAI dataset.
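As a lightweight guardrail (an illustrative pattern, not something shipped with this model), generated queries can be syntax-checked against an empty in-memory SQLite database built from the same schema before they touch real data:

```python
import sqlite3

def sql_parses(schema_ddl: str, query: str) -> bool:
    """Check that a generated query parses and plans against the schema.
    Catches syntax errors and missing tables/columns, but cannot catch
    valid-but-logically-wrong SQL -- human review is still needed."""
    conn = sqlite3.connect(":memory:")  # empty scratch DB; no real data
    try:
        conn.executescript(schema_ddl)    # e.g. the sql_context DDL
        conn.execute(f"EXPLAIN {query}")  # compiles the query without running it
        return True
    except sqlite3.Error:
        return False
    finally:
        conn.close()
```

For example, call `sql_parses(schema, outputs[0]["generated_text"])` before executing anything against a live database.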
## License
This model is derived from Llama 3.2 and is subject to the Llama 3.2 Community License.