build_chat_prompt       Build chat prompt from conversation history
edge_ask                Ask a question using retrieval-augmented
                        generation
edge_benchmark          Performance benchmarking for model inference
edge_cache_info         Cache size information
edge_chat_completion    Generate a chat completion using the model's
                        native template
edge_chat_stream        Interactive chat session with streaming
                        responses
edge_classify           Classify text into predefined categories
edge_clean_cache        Clean up cache directory and manage storage
edge_completion         Generate a text completion using a loaded model
edge_cuda_info          Check whether a CUDA backend is installed and
                        active
edge_download_model     Download a GGUF model from Hugging Face
edge_download_url       Download a model from a direct URL
edge_embeddings         Extract text embeddings from a model
edge_extract            Extract structured data from text
edge_extract_batch      Extract structured data from multiple texts
edge_find_gguf_models   Find and prepare GGUF models for use with
                        edgemodelr
edge_find_ollama_models
                        Find and load Ollama models
edge_free_model         Free model context and release memory
edge_grammar_completion
                        Generate text constrained by a GBNF grammar
edge_index_documents    Build an embedding index from text documents
edge_install_cuda       Install the CUDA backend for GPU-accelerated
                        inference
edge_install_cuda_toolkit
                        Install CUDA runtime libraries required for GPU
                        inference
edge_json_grammar       Generate a GBNF grammar for JSON output from a
                        schema
edge_list_models        List popular pre-configured models
edge_load_model         Load a local GGUF model for inference
edge_load_ollama_model
                        Load an Ollama model by partial SHA-256 hash
edge_map                Apply a prompt template to a vector of texts
edge_model_n_embd       Get the embedding dimension of a loaded model
edge_quick_setup        Quick setup for a popular model
edge_reload_cuda        Activate an installed CUDA backend without
                        restarting R
edge_search             Search an embedding index for relevant chunks
edge_serve              Serve a model as a local OpenAI-compatible API
edge_set_verbose        Control llama.cpp logging verbosity
edge_simd_info          Query SIMD optimization status
edge_similarity         Compute cosine similarity between two embedding
                        vectors
edge_similarity_matrix
                        Compute a similarity matrix for a set of
                        embeddings
edge_small_model_config
                        Get optimized configuration for small language
                        models
edge_stream_completion
                        Stream text completion with real-time token
                        generation
is_valid_model          Check whether a model context is valid
test_ollama_model_compatibility
                        Test if an Ollama model blob can be used with
                        edgemodelr
