Local Large Language Model Inference Engine


Documentation for package ‘edgemodelr’ version 0.4.1

Help Pages

build_chat_prompt                Build chat prompt from conversation history
edge_ask                         Ask a question using retrieval-augmented generation
edge_benchmark                   Performance benchmarking for model inference
edge_cache_info                  Cache size information
edge_chat_completion             Generate a chat completion using the model's native template
edge_chat_stream                 Interactive chat session with streaming responses
edge_classify                    Classify text into predefined categories
edge_clean_cache                 Clean up cache directory and manage storage
edge_completion                  Generate text completion using loaded model
edge_cuda_info                   Check whether a CUDA backend is installed and active
edge_download_model              Download a GGUF model from Hugging Face
edge_download_url                Download a model from a direct URL
edge_embeddings                  Extract text embeddings from a model
edge_extract                     Extract structured data from text
edge_extract_batch               Extract structured data from multiple texts
edge_find_gguf_models            Find and prepare GGUF models for use with edgemodelr
edge_find_ollama_models          Find and load Ollama models
edge_free_model                  Free model context and release memory
edge_grammar_completion          Generate text constrained by a GBNF grammar
edge_index_documents             Build an embedding index from text documents
edge_install_cuda                Install the CUDA backend for GPU-accelerated inference
edge_install_cuda_toolkit        Install CUDA runtime libraries required for GPU inference
edge_json_grammar                Generate a GBNF grammar for JSON output from a schema
edge_list_models                 List popular pre-configured models
edge_load_model                  Load a local GGUF model for inference
edge_load_ollama_model           Load an Ollama model by partial SHA-256 hash
edge_map                         Apply a prompt template to a vector of texts
edge_model_n_embd                Get the embedding dimension of a loaded model
edge_quick_setup                 Quick setup for a popular model
edge_reload_cuda                 Activate an installed CUDA backend without restarting R
edge_search                      Search an embedding index for relevant chunks
edge_serve                       Serve a model as a local OpenAI-compatible API
edge_set_verbose                 Control llama.cpp logging verbosity
edge_simd_info                   Query SIMD optimization status
edge_similarity                  Compute cosine similarity between two embedding vectors
edge_similarity_matrix           Compute a similarity matrix for a set of embeddings
edge_small_model_config          Get optimized configuration for small language models
edge_stream_completion           Stream text completion with real-time token generation
is_valid_model                   Check if model context is valid
test_ollama_model_compatibility  Test if an Ollama model blob can be used with edgemodelr