Ary-007 committed
Commit 1ac2713 · verified · 1 Parent(s): 4716bd4

Update README.md

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -82,4 +82,32 @@ outputs = pipe(
 
  print(outputs[0]["generated_text"])
  ```
-
+ ## Training Details
+
+ The model was fine-tuned using Unsloth on a Tesla T4 GPU (Google Colab).
+
+ ### Hyperparameters
+ 1) LoRA rank (r): 16
+ 2) LoRA alpha: 16
+ 3) Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ 4) Quantization: 4-bit NF4 (NormalFloat4)
+ 5) Max sequence length: 2048
+ 6) Learning rate: 2e-4
+ 7) Optimizer: adamw_8bit
+ 8) Max steps: 60
+
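+ For reference, these settings map onto Unsloth's API roughly as follows. This is a minimal sketch, not the exact training notebook: the base checkpoint name is an assumption, and any argument not listed above is left at the library defaults.
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ # Load the base model with 4-bit NF4 quantization and a 2048-token context.
+ # NOTE: the checkpoint name below is an assumption; substitute the actual base model.
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/Llama-3.2-3B-Instruct",
+     max_seq_length=2048,
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters (r=16, alpha=16) to the seven projection modules listed above.
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     lora_alpha=16,
+     target_modules=[
+         "q_proj", "k_proj", "v_proj", "o_proj",
+         "gate_proj", "up_proj", "down_proj",
+     ],
+ )
+ ```
+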
+ ## Dataset Info
+ The model was trained on the gretelai/synthetic_text_to_sql dataset, using the following fields:
+ 1) sql_context: Used as the database schema context.
+ 2) sql_prompt: The natural-language question.
+ 3) sql: The target SQL query.
+ 4) sql_explanation: The explanation of the query logic.
+
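+ Putting the hyperparameters and dataset fields together, the training step might look like the sketch below. The prompt template, batch size, and TRL plumbing are assumptions (newer TRL versions move these arguments onto SFTConfig), and `model` and `tokenizer` come from the sketch under Training Details.
+
+ ```python
+ from datasets import load_dataset
+ from transformers import TrainingArguments
+ from trl import SFTTrainer
+
+ dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")
+
+ # Fold the four fields into one training string; this template is illustrative.
+ def to_text(example):
+     return {
+         "text": (
+             f"### Schema:\n{example['sql_context']}\n\n"
+             f"### Question:\n{example['sql_prompt']}\n\n"
+             f"### SQL:\n{example['sql']}\n\n"
+             f"### Explanation:\n{example['sql_explanation']}"
+         )
+     }
+
+ dataset = dataset.map(to_text)
+
+ trainer = SFTTrainer(
+     model=model,                        # LoRA model from the Training Details sketch
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     dataset_text_field="text",
+     max_seq_length=2048,
+     args=TrainingArguments(
+         learning_rate=2e-4,
+         optim="adamw_8bit",
+         max_steps=60,
+         per_device_train_batch_size=2,  # assumption: batch size is not stated above
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```
+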
+ ## Limitations
+ 1) Training steps: This model was trained for a limited number of steps (60) as a proof of concept, so it may not generalize well to extremely complex or unseen database schemas.
+ 2) Hallucination: Like all LLMs, it may generate syntactically correct but logically incorrect SQL. Always validate the output before running it against a production database (see the sketch below).
+ 3) Scope: It is optimized for standard SQL (similar to SQLite/PostgreSQL) as presented in the GretelAI dataset.
+
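+ As a lightweight way to follow (2), a generated query can be checked against the schema with Python's built-in sqlite3 module before anything is executed for real. The helper below is illustrative, not part of the model or training code; it relies on SQLite's EXPLAIN, which compiles a statement without running it.
+
+ ```python
+ import sqlite3
+
+ def is_valid_sql(schema_sql: str, query: str) -> bool:
+     """Check that a generated query parses and plans against the schema
+     without actually executing it."""
+     conn = sqlite3.connect(":memory:")
+     try:
+         conn.executescript(schema_sql)    # build tables from sql_context
+         conn.execute(f"EXPLAIN {query}")  # compiles the query, does not run it
+         return True
+     except sqlite3.Error:
+         return False
+     finally:
+         conn.close()
+
+ schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"
+ print(is_valid_sql(schema, "SELECT name FROM users WHERE id = 1;"))  # True
+ print(is_valid_sql(schema, "SELECT nme FROM users;"))                # False (no such column)
+ ```
+
+ This only catches syntax and schema errors; well-formed but logically wrong SQL (point 2 above) still needs human review.
+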
+ ## License
+ This model is derived from Llama 3.2 and is subject to the Llama 3.2 Community License.