Jobly / RAG_ARCHITECTURE.md

Valentina9502

First commit

fdf5af0 verified 20 days ago


	# 🧠 RAG Architecture & Vector Embeddings

	## Overview

	GigMatch AI uses Retrieval-Augmented Generation (RAG) with vector embeddings to match workers and gigs semantically, so a query can match a profile even when the two share no exact keywords.

	## 🏗️ Architecture

	```
	┌────────────────────────────────────────────────────────────┐
	│                       DATA INGESTION                       │
	├────────────────────────────────────────────────────────────┤
	│  50 Workers + 50 Gigs (JSON)                               │
	│          ↓                                                 │
	│  Text Enrichment (skills, bio, location, etc.)             │
	│          ↓                                                 │
	│  HuggingFace Embeddings (all-MiniLM-L6-v2)                 │
	│          ↓                                                 │
	│  Vector Storage (ChromaDB)                                 │
	└────────────────────────────────────────────────────────────┘

	┌────────────────────────────────────────────────────────────┐
	│                       QUERY PIPELINE                       │
	├────────────────────────────────────────────────────────────┤
	│  User Query (worker profile or gig post)                   │
	│          ↓                                                 │
	│  Convert to Search Query                                   │
	│          ↓                                                 │
	│  Embed Query (HuggingFace)                                 │
	│          ↓                                                 │
	│  Semantic Search (Vector Similarity)                       │
	│          ↓                                                 │
	│  Retrieve Top K Results                                    │
	│          ↓                                                 │
	│  Calculate Match Scores                                    │
	│          ↓                                                 │
	│  Return Results to Agent                                   │
	└────────────────────────────────────────────────────────────┘
	```

	## 🦙 LlamaIndex Integration

	### Why LlamaIndex?

	1. Sponsor Recognition - LlamaIndex is a hackathon sponsor 🎉
	2. Production-Ready - Battle-tested RAG framework
	3. Easy Integration - Simple API for vector operations
	4. Flexible - Supports multiple vector stores and embeddings

	### Implementation

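	```python
	import chromadb
	from llama_index.core import Document, StorageContext, VectorStoreIndex
	from llama_index.embeddings.huggingface import HuggingFaceEmbedding
	from llama_index.vector_stores.chroma import ChromaVectorStore

	# Initialize embedding model
	embed_model = HuggingFaceEmbedding(
	    model_name="sentence-transformers/all-MiniLM-L6-v2"
	)

	# Create documents with rich text
	# (name, skills, location, and worker_data come from one loaded worker record)
	worker_doc = Document(
	    text=f"Name: {name}, Skills: {skills}, Location: {location}...",
	    metadata=worker_data
	)
	documents = [worker_doc]

	# Wrap a Chroma collection as the vector store
	# (in-memory client assumed here; a persistent client works the same way)
	chroma_client = chromadb.EphemeralClient()
	chroma_collection = chroma_client.get_or_create_collection("workers")
	vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
	storage_context = StorageContext.from_defaults(vector_store=vector_store)

	# Create vector index (the vector store is passed through storage_context)
	index = VectorStoreIndex.from_documents(
	    documents,
	    storage_context=storage_context,
	    embed_model=embed_model,
	)

	# Query (as_query_engine also needs an LLM configured for response synthesis)
	query_engine = index.as_query_engine(similarity_top_k=5)
	response = query_engine.query("Looking for plumber in Rome...")
	```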

	## 🤗 HuggingFace Embeddings

	### Model: all-MiniLM-L6-v2

	Why this model?
	- ✅ Fast inference (only 23M parameters)
	- ✅ Good quality embeddings (384 dimensions)
	- ✅ Fine-tuned on over 1 billion sentence pairs for semantic similarity
	- ✅ HuggingFace sponsor recognition 🤗

	Performance:
	- Embedding time: ~20ms per text
	- Vector size: 384 dimensions
	- Cosine similarity for matching

	### How Embeddings Work

	1. Text → Vector: Each worker/gig is converted to a 384-dimensional vector
	2. Semantic Meaning: Similar meanings = similar vectors
	3. Cosine Similarity: Measures the angle between vectors. The raw value ranges from -1 to 1; scores for related texts typically fall between 0 and 1
	4. Top K: Return K most similar vectors

	Example:
	```python
	text1 = "Experienced plumber, pipe repair, Rome"
	text2 = "Looking for plumbing services, leak fix, Rome"

	# After embedding (illustrative values, not real model output):
	vec1 = [0.23, -0.45, 0.67, ...]  # 384 dimensions
	vec2 = [0.21, -0.43, 0.69, ...]  # 384 dimensions

	# Cosine similarity: 0.94 (very similar!)
	```
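The similarity and Top K steps described above can be sketched in plain Python. This is a minimal illustration, not the project's code: the helper names `cosine_similarity` and `top_k` are hypothetical, and the 3-dimensional vectors are toy values standing in for real 384-dimensional embeddings.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (range -1 to 1)."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same length")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # convention chosen here for zero vectors
    return dot / (norm_a * norm_b)


def top_k(query_vec, doc_vecs, k=5):
    """Return (doc_id, score) pairs for the k most similar documents."""
    scored = [
        (doc_id, cosine_similarity(query_vec, vec))
        for doc_id, vec in doc_vecs.items()
    ]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


# Toy 3-dimensional vectors (assumed values; real embeddings have 384 dimensions).
docs = {
    "plumber": [0.21, -0.43, 0.69],
    "electrician": [-0.11, 0.52, -0.33],
}
query = [0.23, -0.45, 0.67]
results = top_k(query, docs, k=1)  # the plumber vector ranks first
```

In the actual pipeline the vector store performs this search, so a helper like this is mainly useful for explaining or sanity-checking scores.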

	## 📊 ChromaDB Vector Store

	### Why ChromaDB?

	- ✅ Simple local setup (no server needed)
	- ✅ Fast vector search
	- ✅ Native Python API
	- ✅ Persistence support
	- ✅ Perfect for demo/hackathon

	### Collections

	Workers Collection:
	- 50 worker profiles
	- Indexed by skills, experience, location
	- Searchable by semantic similarity

	Gigs Collection:
	- 50 gig posts
	- Indexed by requirements, project details
	- Searchable by semantic similarity

	## 🎯 Semantic Matching Algorithm

	### Traditional Keyword Matching (OLD)
	```python
	# Problem: Only finds exact keyword matches
	if "plumbing" in worker_skills and "plumbing" in gig_requirements:
	    score += 1  # Match!
	```
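To make the limitation concrete, here is a small sketch of word-overlap matching. The `keyword_overlap` helper is hypothetical and deliberately naive (no stemming, no synonyms); the sample texts are the ones used elsewhere in this document.

```python
def keyword_overlap(query, profile):
    """Count words shared by two texts after lowercasing and stripping punctuation."""
    def tokens(text):
        return {word.strip(".,!?").lower() for word in text.split()} - {""}
    return len(tokens(query) & tokens(profile))


query = "Need someone to fix leaking pipes"
plumber = "Plumber, pipe repair specialist"
electrician = "Electrician, wiring expert"

# Both counts are 0: "pipes" does not equal "pipe", and "fix" does not equal
# "repair", so exact matching cannot tell the plumber from the electrician.
plumber_hits = keyword_overlap(query, plumber)
electrician_hits = keyword_overlap(query, electrician)
```

Stemming would rescue "pipes" vs. "pipe", but not "fix" vs. "repair", which is the gap that embeddings are meant to close.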

	### Semantic Matching with RAG (NEW)
	```
	Solution: understands meaning and context
	(embedding and similarity values below are illustrative)

	Query: "Need someone to fix leaking pipes"
	Embedding: [0.23, -0.45, 0.67, ...]

	Worker 1: "Plumber, pipe repair specialist"
	Embedding: [0.21, -0.43, 0.69, ...]
	Similarity: 0.94 ← HIGH MATCH!

	Worker 2: "Electrician, wiring expert"
	Embedding: [-0.11, 0.52, -0.33, ...]
	Similarity: 0.12 ← LOW MATCH

	Semantic search finds Worker 1 even though
	the word "plumbing" wasn't explicitly mentioned!
	```

	### Advantages

	1. Synonym Understanding: "plumber" ≈ "pipe specialist"
	2. Context Awareness: "fix pipes" ≈ "repair plumbing"
	3. Related Concepts: "garden" ≈ "landscaping" ≈ "outdoor"
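	4. Wording Variations: Paraphrases and alternative phrasings still match. Note that all-MiniLM-L6-v2 is trained primarily on English text, so cross-language matching would need a multilingual model
	5. Fuzzy Matching: Minor typos and spelling variations often still match, though this is not guaranteed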

	## 🔬 Match Score Calculation

	### Components

	1. Semantic Similarity (70% weight)
	   - Cosine similarity from vector embeddings
	   - Treated as a 0.0 to 1.0 score (raw cosine similarity can dip below 0 for unrelated texts)
	   - Higher = better semantic match

	2. Keyword Overlap (20% weight)
	   - Exact skill matches
	   - Experience level alignment
	   - Calculated as: matched_skills / required_skills

	3. Location Match (10% weight)
	   - Geographic proximity
	   - Remote work consideration
	   - Binary: 1.0 (same location/remote) or 0.5 (different)

	### Final Formula

	```python
	semantic_score = cosine_similarity(query_vec, doc_vec)
	# Guard against gigs that list no required skills (avoids division by zero)
	keyword_score = (
	    len(matched_skills) / len(required_skills) if required_skills else 0.0
	)
	location_score = 1.0 if location_match else 0.5

	final_score = (
	    semantic_score * 0.7 +
	    keyword_score * 0.2 +
	    location_score * 0.1
	) * 100  # Convert to 0-100 scale
	```
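As a worked example, the formula can be wrapped in a function and evaluated on sample inputs. This is a sketch under assumptions: the name `match_score` is hypothetical, and clamping the semantic score into the 0 to 1 range is an added safeguard that the formula above does not state.

```python
def match_score(semantic_score, matched_skills, required_skills, location_match):
    """Weighted match score: 70% semantic, 20% keyword overlap, 10% location."""
    # Assumption: clamp so a negative cosine similarity cannot reduce the total.
    semantic = min(max(semantic_score, 0.0), 1.0)
    # A gig with no required skills contributes no keyword score.
    keyword = len(matched_skills) / len(required_skills) if required_skills else 0.0
    location = 1.0 if location_match else 0.5
    return (semantic * 0.7 + keyword * 0.2 + location * 0.1) * 100


# Semantic similarity 0.94, 2 of 3 required skills matched, same city:
# 0.94 * 0.7 + (2 / 3) * 0.2 + 1.0 * 0.1 = 0.658 + 0.133 + 0.100 = 0.891
score = match_score(
    semantic_score=0.94,
    matched_skills=["plumbing", "pipe repair"],
    required_skills=["plumbing", "pipe repair", "welding"],
    location_match=True,
)
print(round(score, 2))  # 89.13
```

Because a location mismatch still contributes 0.5 × 0.1, the lowest possible result with these weights is 5, not 0.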

	## 📈 Performance & Scalability

	### Current Setup (Demo)
	- 50 workers + 50 gigs = 100 vectors
	- Average query time: ~100ms
	- Embedding model loaded in memory: ~100MB
	- Total memory usage: ~200MB

	### Production Scaling

	For 10,000 entries:
	- ✅ Expected to stay fast (target: <500ms per query)
	- ✅ ChromaDB handles easily
	- ✅ Consider batch embedding for ingestion

	For 100,000+ entries:
	- Use hosted vector DB (Pinecone, Weaviate)
	- Batch processing for embeddings
	- Caching layer for frequent queries
	- GPU acceleration for embedding
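
Two of the items above, batch processing and a caching layer, can be sketched with the standard library alone. Everything here is illustrative: `embed_batch` is a hypothetical stand-in for a real batch embedding call, and the batch size and cache size are arbitrary assumptions.

```python
from functools import lru_cache


def batched(items, batch_size):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def embed_batch(texts):
    """Hypothetical stand-in for a real model call; returns one vector per text."""
    return [[float(len(text))] for text in texts]


def embed_all(texts, batch_size=64):
    """Embed a large list in fixed-size batches instead of one call per text."""
    vectors = []
    for batch in batched(texts, batch_size):
        vectors.extend(embed_batch(batch))
    return vectors


@lru_cache(maxsize=1024)
def cached_query_embedding(query):
    """Embed a query once; repeated identical queries are served from the cache."""
    # Returned as a tuple so the cached value cannot be mutated by callers.
    return tuple(embed_batch([query])[0])
```

An in-process `lru_cache` only helps within a single worker; a multi-process deployment would need a shared cache instead.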

	## 🎨 Benefits for the Hackathon

	### Why This is WOW

	1. Not Just LLM Calls: Real vector database with semantic search
	2. Sponsor Integration: LlamaIndex 🦙 + HuggingFace 🤗
	3. Production Patterns: Proper RAG architecture
	4. Scalable: Easy to extend to 1000s of entries
	5. Explainable: Can show similarity scores

	### Demo Impact

	Judges will see:
	- ✅ "Powered by LlamaIndex + HuggingFace"
	- ✅ Semantic similarity scores in results
	- ✅ Better matches than keyword search
	- ✅ 100 entries in vector database
	- ✅ Real-time vector search

	## 🔮 Future Enhancements

	### Easy Wins
	- [ ] Add filters (location, budget, experience)
	- [ ] Implement hybrid search (semantic + keyword)
	- [ ] Add reranking with cross-encoders
	- [ ] Cache popular queries

	### Advanced
	- [ ] Fine-tune embedding model on gig data
	- [ ] Multi-modal embeddings (add images)
	- [ ] Graph relationships between skills
	- [ ] Temporal embeddings (availability matching)

	## 📚 Code Examples

	### Creating the Index

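	```python
	import chromadb
	from llama_index.core import Document, StorageContext, VectorStoreIndex
	from llama_index.vector_stores.chroma import ChromaVectorStore

	# 1. Load data
	workers = load_workers_from_json()

	# 2. Create documents
	documents = []
	for worker in workers:
	    text = f"""
	    Name: {worker['name']}
	    Skills: {', '.join(worker['skills'])}
	    Experience: {worker['experience']}
	    Location: {worker['location']}
	    """
	    # Chroma metadata values must be scalars, so join list fields into strings
	    metadata = {**worker, "skills": ", ".join(worker["skills"])}
	    doc = Document(text=text, metadata=metadata)
	    documents.append(doc)

	# 3. Create vector store
	# (in-memory client assumed; cosine space is set because Chroma defaults to L2 distance)
	chroma_client = chromadb.EphemeralClient()
	chroma_collection = chroma_client.create_collection(
	    "workers", metadata={"hnsw:space": "cosine"}
	)
	vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
	storage_context = StorageContext.from_defaults(vector_store=vector_store)

	# 4. Build index (embed_model is the HuggingFaceEmbedding created in the Implementation section)
	index = VectorStoreIndex.from_documents(
	    documents,
	    storage_context=storage_context,
	    embed_model=embed_model,
	)
	```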

	### Querying the Index

	```python
	# 1. Create query
	query = f"""
	Looking for: {', '.join(required_skills)}
	Location: {location}
	Experience: {experience_level}
	"""

	# 2. Get query engine
	query_engine = index.as_query_engine(similarity_top_k=5)

	# 3. Execute query
	response = query_engine.query(query)

	# 4. Extract results
	for node in response.source_nodes:
	    worker_data = node.metadata
	    similarity_score = node.score
	    print(f"Match: {worker_data['name']}, Score: {similarity_score}")
	```

	## 🎯 Key Takeaways

	1. RAG = Better Matches: Semantic understanding > keyword matching
	2. LlamaIndex = Easy: Production RAG in <100 lines of code
	3. HuggingFace = Quality: Great embeddings, sponsor recognition
	4. ChromaDB = Fast: Local vector store, perfect for demo
	5. Scalable = Future-proof: Architecture works at scale

	---

	This is what makes GigMatch AI stand out in the hackathon! 🚀