MLTrainingService.swift
MLTrainingService Documentation
Overview
MLTrainingService.swift generates text embeddings using Apple's Natural Language framework. The service uses pre-trained word embeddings to convert text into numerical vectors, which can be used for similarity matching and text analysis.
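As a rough illustration of how such vectors support similarity matching, the sketch below computes cosine similarity between two embeddings. The function name `cosineSimilarity` is hypothetical and is not part of MLTrainingService; the service and its consumers may compare vectors differently.

```swift
import Foundation

/// Cosine similarity between two equal-length vectors.
/// Returns nil when the lengths differ, the vectors are empty,
/// or either vector has zero magnitude.
func cosineSimilarity(_ a: [Double], _ b: [Double]) -> Double? {
    guard a.count == b.count, !a.isEmpty else { return nil }
    var dot = 0.0, normA = 0.0, normB = 0.0
    for (x, y) in zip(a, b) {
        dot += x * y
        normA += x * x
        normB += y * y
    }
    guard normA > 0, normB > 0 else { return nil }
    return dot / (normA.squareRoot() * normB.squareRoot())
}
```

Values near 1 indicate vectors pointing in nearly the same direction; values near 0 indicate little overlap.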
Core Components
TrainingData Structure
text (String): The input text to be processed
category (String?): Optional category label
metadata ([String: Any]?): Optional additional information
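The fields listed above could be declared roughly as follows. This is a minimal sketch: the property names and types come from this document, while the initializer with nil defaults is an assumption.

```swift
import Foundation

/// Sketch of the TrainingData shape described in this section.
struct TrainingData {
    let text: String                 // The input text to be processed
    let category: String?            // Optional category label
    let metadata: [String: Any]?     // Optional additional information

    // Assumption: optional fields default to nil in the initializer.
    init(text: String, category: String? = nil, metadata: [String: Any]? = nil) {
        self.text = text
        self.category = category
        self.metadata = metadata
    }
}

let sample = TrainingData(text: "Hello world", category: "greeting")
```

Because `metadata` is typed as `[String: Any]`, a struct like this cannot get synthesized `Codable` or `Equatable` conformance, which is worth keeping in mind if the data has to be persisted.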
TrainingConfig Structure
dimension (Int): Size of the embedding vectors
windowSize (Int): Context window for word embeddings
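A configuration type with these two fields might look like the sketch below. The `defaultConfig` name and the values 300 and 5 are placeholders chosen for illustration; the real defaults are not stated in this document.

```swift
/// Sketch of the TrainingConfig shape described in this section.
struct TrainingConfig {
    var dimension: Int    // Size of the embedding vectors
    var windowSize: Int   // Context window for word embeddings

    // Assumption: illustrative placeholder defaults, not the service's actual values.
    static let defaultConfig = TrainingConfig(dimension: 300, windowSize: 5)
}

// Overriding both values via the memberwise initializer.
let custom = TrainingConfig(dimension: 128, windowSize: 3)
```

With a pre-trained `NLEmbedding`, the vector size is reported by the model's own `dimension` property, so a configured `dimension` would presumably need to agree with it.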
Main Features
Initialization
Creates a new MLTrainingService instance
Loads Apple's pre-trained word embeddings for English
Uses the default configuration if none is specified
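Loading the pre-trained English embedding can be sketched with `NLEmbedding.wordEmbedding(for:)`, which returns nil when no built-in embedding is available. The `EmbeddingLoader` class is a hypothetical stand-in for the service's initializer, not its actual implementation.

```swift
import Foundation
#if canImport(NaturalLanguage)
import NaturalLanguage

/// Hypothetical stand-in for the initialization logic described above.
final class EmbeddingLoader {
    let embedding: NLEmbedding?

    init(language: NLLanguage = .english) {
        // Returns nil if the OS ships no word embedding for this language.
        embedding = NLEmbedding.wordEmbedding(for: language)
    }

    /// Vector size reported by the loaded model, if one was loaded.
    var dimension: Int? { embedding?.dimension }
}

let loader = EmbeddingLoader()
#endif
```

Keeping the embedding optional lets callers detect an unavailable model up front instead of failing later during embedding generation.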
Data Management
Methods to add single or multiple training data points
Stores data for potential future use or analysis
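The add-one and add-many behavior could be sketched as a small in-memory store. The class and method names here are illustrative assumptions, and `TrainingData` is reduced to two fields to keep the example short.

```swift
/// Reduced stand-in for the TrainingData structure documented earlier.
struct TrainingData {
    let text: String
    let category: String?
}

/// Hypothetical in-memory store; names are illustrative only.
final class TrainingDataStore {
    private(set) var items: [TrainingData] = []

    /// Adds a single data point.
    func add(_ item: TrainingData) {
        items.append(item)
    }

    /// Adds multiple data points at once.
    func add(contentsOf newItems: [TrainingData]) {
        items.append(contentsOf: newItems)
    }
}

let store = TrainingDataStore()
store.add(TrainingData(text: "First entry", category: nil))
store.add(contentsOf: [
    TrainingData(text: "Second entry", category: "notes"),
    TrainingData(text: "Third entry", category: "notes"),
])
print(store.items.count) // 3
```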
Embedding Generation
Tokenizes input text into words
Generates embeddings for each word
Averages word vectors to create text embedding
Returns nil if embedding generation fails
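The tokenize, look up, and average steps above can be sketched in plain Swift. To keep the example runnable anywhere, the word-embedding model is abstracted as a `lookup` closure (a hypothetical parameter) and tokenization is a simple split on non-alphanumeric characters; the actual service may tokenize differently.

```swift
import Foundation

/// Averages per-word vectors into a single text embedding.
/// `lookup` stands in for the word-embedding model (hypothetical parameter).
func averageEmbedding(for text: String,
                      lookup: (String) -> [Double]?) -> [Double]? {
    // Simplified tokenization: lowercase, then split on anything
    // that is not a letter or digit.
    let words = text.lowercased()
        .split(whereSeparator: { !$0.isLetter && !$0.isNumber })
        .map(String.init)

    var sum: [Double] = []
    var count = 0
    for word in words {
        // Skip words the model does not know.
        guard let vector = lookup(word), !vector.isEmpty else { continue }
        if sum.isEmpty { sum = [Double](repeating: 0, count: vector.count) }
        // Ignore vectors whose size disagrees with the first one seen.
        guard vector.count == sum.count else { continue }
        for i in vector.indices { sum[i] += vector[i] }
        count += 1
    }

    // No usable word vectors: signal failure with nil.
    guard count > 0 else { return nil }
    return sum.map { $0 / Double(count) }
}

// Toy vocabulary used only for this example.
let toyVectors: [String: [Double]] = ["good": [1, 0], "day": [0, 1]]
let embedding = averageEmbedding(for: "Good day!") { toyVectors[$0] }
print(embedding ?? []) // [0.5, 0.5]
```

With the Natural Language framework, the closure could be backed by `NLEmbedding`'s `vector(for:)`, which likewise returns nil for out-of-vocabulary words.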
Async Support
Asynchronous versions of embedding generation
Supports batch processing of multiple texts
Uses Swift concurrency for efficient processing
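Batch processing with Swift concurrency might be structured around a task group, as in the sketch below. The `embedBatch` function and its `embed` parameter are hypothetical; they stand in for whatever async embedding call the service exposes.

```swift
import Foundation

/// Runs `embed` over many texts concurrently and returns results in input order.
/// `embed` is a hypothetical stand-in for the service's embedding call.
func embedBatch(_ texts: [String],
                embed: @escaping @Sendable (String) async -> [Double]?) async -> [[Double]?] {
    await withTaskGroup(of: (Int, [Double]?).self,
                        returning: [[Double]?].self) { group in
        for (index, text) in texts.enumerated() {
            group.addTask {
                let vector = await embed(text)
                return (index, vector)
            }
        }
        // Child tasks finish in any order, so place each result by its index.
        var results = [[Double]?](repeating: nil, count: texts.count)
        for await (index, vector) in group {
            results[index] = vector
        }
        return results
    }
}
```

Failed items stay nil in the output, so the result array always lines up with the input array.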
Error Handling
insufficientData: Not enough data for processing
embeddingGenerationFailed: Failed to generate embeddings
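The two error cases above could be modeled as a Swift `Error` enum. The enum name `MLTrainingError` and the `requireEmbedding` helper are assumptions made for this sketch; only the case names come from this document.

```swift
import Foundation

/// Sketch of the error cases listed above. The enum name is an assumption.
enum MLTrainingError: Error {
    case insufficientData            // Not enough data for processing
    case embeddingGenerationFailed   // Failed to generate embeddings
}

/// Hypothetical helper that turns a nil embedding into a thrown error.
func requireEmbedding(_ vector: [Double]?) throws -> [Double] {
    guard let vector = vector else {
        throw MLTrainingError.embeddingGenerationFailed
    }
    return vector
}

do {
    _ = try requireEmbedding(nil)
} catch MLTrainingError.embeddingGenerationFailed {
    print("Embedding generation failed")
} catch {
    print("Unexpected error: \(error)")
}
```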
Usage Examples
Basic Usage
Async Usage
Custom Configuration
Best Practices
Memory Management
Be mindful of batch sizes when processing multiple texts
Consider memory usage when storing training data
Performance
Use async methods for better performance with large datasets
Process texts in batches when possible
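One way to keep batch sizes bounded is to split the input into fixed-size chunks and process one chunk at a time. The `chunked` helper below is a hypothetical utility, not part of the service.

```swift
/// Splits `items` into consecutive chunks of at most `size` elements.
func chunked<T>(_ items: [T], size: Int) -> [[T]] {
    guard size > 0 else { return [] }
    return stride(from: 0, to: items.count, by: size).map { start in
        Array(items[start..<min(start + size, items.count)])
    }
}

let texts = ["a", "b", "c", "d", "e"]
for batch in chunked(texts, size: 2) {
    // Each batch would be embedded before moving on to the next,
    // which bounds how many results are held in flight at once.
    print(batch)
}
// Prints ["a", "b"], then ["c", "d"], then ["e"]
```

A suitable chunk size depends on text length and available memory, so it is best treated as a tunable value.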
Error Handling
Always check for nil results when generating embeddings
Implement appropriate error handling for your use case
Dependencies
Foundation
NaturalLanguage
CoreML
Related Components
EntryModel: Uses embeddings for entry processing
ClusterModel: Uses embeddings for similarity matching
Storage layer: Persists generated embeddings