Ary-007 committed
Commit 1ac2713 · verified · 1 Parent(s): 4716bd4

Update README.md

Files changed (1)
  1. README.md +29 -1
README.md CHANGED
@@ -82,4 +82,32 @@ outputs = pipe(
 
  print(outputs[0]["generated_text"])
  ```
-
+ ## Training Details
+
+ The model was fine-tuned using Unsloth on a Tesla T4 GPU (Google Colab).
+
+ ### Hyperparameters
+ 1) LoRA rank (r): 16
+ 2) LoRA alpha: 16
+ 3) Target modules: q_proj, k_proj, v_proj, o_proj, gate_proj, up_proj, down_proj
+ 4) Quantization: 4-bit NF4 (NormalFloat4)
+ 5) Max sequence length: 2048
+ 6) Learning rate: 2e-4
+ 7) Optimizer: adamw_8bit
+ 8) Max steps: 60
+
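+ For reference, these settings map onto Unsloth's API roughly as follows. This is a minimal sketch, not the exact training notebook: the base checkpoint name is an assumption, and any argument not listed above is left at the library defaults.
+
+ ```python
+ from unsloth import FastLanguageModel
+
+ # Load the base model with 4-bit NF4 quantization and a 2048-token context.
+ # NOTE: the checkpoint name below is an assumption; substitute the actual base model.
+ model, tokenizer = FastLanguageModel.from_pretrained(
+     model_name="unsloth/Llama-3.2-3B-Instruct",
+     max_seq_length=2048,
+     load_in_4bit=True,
+ )
+
+ # Attach LoRA adapters (r=16, alpha=16) to the seven projection modules listed above.
+ model = FastLanguageModel.get_peft_model(
+     model,
+     r=16,
+     lora_alpha=16,
+     target_modules=[
+         "q_proj", "k_proj", "v_proj", "o_proj",
+         "gate_proj", "up_proj", "down_proj",
+     ],
+ )
+ ```
+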
+ ## Dataset Info
+ The model was trained on the gretelai/synthetic_text_to_sql dataset, using the following fields:
+ 1) sql_context: Used as the database schema context.
+ 2) sql_prompt: The natural-language question.
+ 3) sql: The target SQL query.
+ 4) sql_explanation: The explanation of the query logic.
+
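+ Putting the hyperparameters and dataset fields together, the training step might look like the sketch below. The prompt template, batch size, and TRL plumbing are assumptions (newer TRL versions move these arguments onto SFTConfig), and `model` and `tokenizer` come from the sketch under Training Details.
+
+ ```python
+ from datasets import load_dataset
+ from transformers import TrainingArguments
+ from trl import SFTTrainer
+
+ dataset = load_dataset("gretelai/synthetic_text_to_sql", split="train")
+
+ # Fold the four fields into one training string; this template is illustrative.
+ def to_text(example):
+     return {
+         "text": (
+             f"### Schema:\n{example['sql_context']}\n\n"
+             f"### Question:\n{example['sql_prompt']}\n\n"
+             f"### SQL:\n{example['sql']}\n\n"
+             f"### Explanation:\n{example['sql_explanation']}"
+         )
+     }
+
+ dataset = dataset.map(to_text)
+
+ trainer = SFTTrainer(
+     model=model,                        # LoRA model from the Training Details sketch
+     tokenizer=tokenizer,
+     train_dataset=dataset,
+     dataset_text_field="text",
+     max_seq_length=2048,
+     args=TrainingArguments(
+         learning_rate=2e-4,
+         optim="adamw_8bit",
+         max_steps=60,
+         per_device_train_batch_size=2,  # assumption: batch size is not stated above
+         output_dir="outputs",
+     ),
+ )
+ trainer.train()
+ ```
+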
+ ## Limitations
+ 1) Training steps: This model was trained for a limited number of steps (60) as a proof of concept, so it may not generalize well to extremely complex or unseen database schemas.
+ 2) Hallucination: Like all LLMs, it may generate syntactically correct but logically incorrect SQL. Always validate the output before running it against a production database (see the sketch below).
+ 3) Scope: It is optimized for standard SQL (similar to SQLite/PostgreSQL) as presented in the GretelAI dataset.
+
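+ As a lightweight way to follow (2), a generated query can be checked against the schema with Python's built-in sqlite3 module before anything is executed for real. The helper below is illustrative, not part of the model or training code; it relies on SQLite's EXPLAIN, which compiles a statement without running it.
+
+ ```python
+ import sqlite3
+
+ def is_valid_sql(schema_sql: str, query: str) -> bool:
+     """Check that a generated query parses and plans against the schema
+     without actually executing it."""
+     conn = sqlite3.connect(":memory:")
+     try:
+         conn.executescript(schema_sql)    # build tables from sql_context
+         conn.execute(f"EXPLAIN {query}")  # compiles the query, does not run it
+         return True
+     except sqlite3.Error:
+         return False
+     finally:
+         conn.close()
+
+ schema = "CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT);"
+ print(is_valid_sql(schema, "SELECT name FROM users WHERE id = 1;"))  # True
+ print(is_valid_sql(schema, "SELECT nme FROM users;"))                # False (no such column)
+ ```
+
+ This only catches syntax and schema errors; well-formed but logically wrong SQL (point 2 above) still needs human review.
+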
+ ## License
+ This model is derived from Llama 3.2 and is subject to the Llama 3.2 Community License.