Model Gallery

24 models from 1 repository
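The models listed here can be installed at runtime through LocalAI's gallery API. Below is a minimal sketch, assuming a LocalAI instance on its default port; the chosen model id is just one example from this page.

```python
# Minimal sketch: install a gallery model via LocalAI's model-apply endpoint.
# The base URL assumes a default local install; the model id follows the
# "<repository>@<model-name>" convention used by the gallery.
import requests

BASE_URL = "http://localhost:8080"

resp = requests.post(
    f"{BASE_URL}/models/apply",
    json={"id": "localai@phi-2-chat"},
    timeout=30,
)
resp.raise_for_status()
# LocalAI queues the download as a background job and returns a handle
# that can be polled for progress.
print(resp.json())
```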

noromaid-13b-0.4-DPO

Repository: localai · License: cc-by-nc-4.0

llava-1.6-vicuna
LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities that mimic the spirit of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

Repository: localai · License: apache-2.0

llava-1.6-mistral
The Mistral variant of LLaVA 1.6: an end-to-end trained large multimodal model that combines a vision encoder with a Mistral language model (rather than Vicuna) for general-purpose visual and language understanding, achieving impressive chat capabilities that mimic the spirit of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

Repository: localai · License: apache-2.0

llava-1.5
LLaVA is a novel end-to-end trained large multimodal model that combines a vision encoder and Vicuna for general-purpose visual and language understanding, achieving impressive chat capabilities that mimic the spirit of the multimodal GPT-4 and setting a new state-of-the-art accuracy on Science QA.

Repository: localai · License: apache-2.0
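Once installed, the LLaVA variants above can be queried with mixed text and image content through LocalAI's OpenAI-compatible chat endpoint. A minimal sketch, assuming a local instance and a placeholder image URL:

```python
# Minimal sketch: multimodal chat with an installed LLaVA model through
# LocalAI's OpenAI-compatible API. Base URL, API key, and image URL are
# placeholder assumptions.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

response = client.chat.completions.create(
    model="llava-1.6-mistral",  # any installed LLaVA variant from this page
    messages=[{
        "role": "user",
        "content": [
            {"type": "text", "text": "Describe what is shown in this image."},
            {"type": "image_url",
             "image_url": {"url": "https://example.com/photo.jpg"}},
        ],
    }],
)
print(response.choices[0].message.content)
```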

phi-2-chat:Q8_0
Phi-2 fine-tuned on the OpenHermes 2.5 dataset, optimised for multi-turn conversation and character impersonation. The dataset was pre-processed by removing all refusals, removing any mention of an AI assistant, splitting multi-turn dialogs generated in the dataset into multi-turn conversation records, and adding NSFW generated conversations from the Teatime dataset. Developed by l3utterfly and funded by Layla Network. Model type: Phi. Language (NLP): English. License: MIT. Finetuned from model: Phi-2.

Repository: localai · License: mit
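The preprocessing described above is straightforward to picture in code. The following is an illustrative sketch only: the record format, refusal markers, and splitting rule are assumptions, not the authors' actual pipeline.

```python
# Illustrative sketch of the described dataset cleanup; refusal markers,
# record format, and the splitting rule are assumptions.
REFUSAL_MARKERS = ("i'm sorry", "i cannot", "as an ai")

def preprocess(dialogs):
    """dialogs: list of dialogs, each a list of {"role", "content"} turns."""
    records = []
    for dialog in dialogs:
        # Drop turns containing refusals or mentions of an AI assistant.
        turns = [
            t for t in dialog
            if not any(m in t["content"].lower() for m in REFUSAL_MARKERS)
            and "ai assistant" not in t["content"].lower()
        ]
        # Split long dialogs into smaller multi-turn conversation records.
        for i in range(0, len(turns), 4):
            chunk = turns[i:i + 4]
            if len(chunk) >= 2:
                records.append(chunk)
    return records
```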

phi-2-chat
Phi-2 fine-tuned on the OpenHermes 2.5 dataset, optimised for multi-turn conversation and character impersonation. The dataset was pre-processed by removing all refusals, removing any mention of an AI assistant, splitting multi-turn dialogs generated in the dataset into multi-turn conversation records, and adding NSFW generated conversations from the Teatime dataset. Developed by l3utterfly and funded by Layla Network. Model type: Phi. Language (NLP): English. License: MIT. Finetuned from model: Phi-2.

Repository: localai · License: mit

phi-2-orange
A two-step finetune of Phi-2, with a bit of zest. An updated model with higher evaluation scores is available at rhysjones/phi-2-orange-v2, if you wish to test it.

Repository: localai · License: mit

phi-3-mini-4k-instruct
Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family; the Mini version comes in two variants, 4K and 128K, which is the context length (in tokens) it can support. The model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3-Mini-4K-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.

Repository: localai · License: mit

phi-3-mini-4k-instruct:fp16
Phi-3-Mini-4K-Instruct is a 3.8B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family; the Mini version comes in two variants, 4K and 128K, which is the context length (in tokens) it can support. The model underwent a post-training process that incorporates both supervised fine-tuning and direct preference optimization to ensure precise instruction adherence and robust safety measures. When assessed against benchmarks testing common sense, language understanding, math, code, long context, and logical reasoning, Phi-3-Mini-4K-Instruct showed robust, state-of-the-art performance among models with fewer than 13 billion parameters.

Repository: localai · License: mit
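With the 4K variants above, prompt and completion share a 4096-token budget, so it pays to cap the completion length. A minimal usage sketch against LocalAI's OpenAI-compatible endpoint; the base URL and prompt are placeholder assumptions.

```python
# Minimal sketch: chat with the 4K-context Phi-3 variant via LocalAI's
# OpenAI-compatible endpoint. Base URL and prompt are placeholders.
from openai import OpenAI

client = OpenAI(base_url="http://localhost:8080/v1", api_key="not-needed")

# Prompt and completion together must fit in the 4096-token context,
# so keep max_tokens well under the budget.
response = client.chat.completions.create(
    model="phi-3-mini-4k-instruct",
    messages=[{"role": "user",
               "content": "Summarize direct preference optimization in two sentences."}],
    max_tokens=256,
)
print(response.choices[0].message.content)
```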

phi-3-medium-4k-instruct
Phi-3-Medium-4K-Instruct is a 14B-parameter, lightweight, state-of-the-art open model trained with the Phi-3 datasets, which include both synthetic data and filtered publicly available website data, with a focus on high-quality and reasoning-dense properties. The model belongs to the Phi-3 family; the Medium version comes in two variants, 4K and 128K, which is the context length (in tokens) it can support.

Repository: localai · License: mit

cream-phi-3-14b-v1
CreamPhi 14B is the first Phi Medium model to be trained on roleplay and "moist" content.

Repository: localai · License: mit

phi3-4x4b-v1
A continually pretrained Phi-3-mini sparse MoE upcycle.

Repository: localai · License: mit

phi-3.1-mini-4k-instruct
This is an update to the original instruction-tuned Phi-3-mini release, based on valuable customer feedback. The model used additional post-training data, leading to substantial gains in instruction following and structured output. It is based on the original model from Microsoft, but has been updated and quantized using llama.cpp release b3278.

Repository: localai · License: mit

phillama-3.8b-v0.1
Phillama is a model based on Phi-3-mini and trained on the Llama-generated dataset raincandy-u/Dextromethorphan-10k to make it more "llama-like". The model has also been converted into Llama format, so it works with any Llama-2/3 workflow. It aims to generate text with a specific "llama-like" style and is suited for text-generation tasks.

Repository: localai · License: mit

calme-2.3-phi3-4b
MaziyarPanahi/calme-2.1-phi3-4b: a DPO fine-tune of the microsoft/Phi-3-mini-4k-instruct model.

Repository: localai · License: mit

phi-3.5-mini-instruct
Phi-3.5-mini is a lightweight, state-of-the-art open model built upon the datasets used for Phi-3 (synthetic data and filtered publicly available websites) with a focus on very high-quality, reasoning-dense data. The model belongs to the Phi-3 model family and supports a 128K-token context length. It underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization, to ensure precise instruction adherence and robust safety measures.

Repository: localai · License: mit

calme-2.1-phi3.5-4b-i1
This model is a fine-tuned version of microsoft/Phi-3.5-mini-instruct, pushing the boundaries of natural language understanding and generation even further. The goal was to create a versatile and robust model that excels across a wide range of benchmarks and real-world applications.

Repository: localai · License: mit

phi-3.5-mini-titanfusion-0.2
This model was merged using the TIES merge method with microsoft/Phi-3.5-mini-instruct as the base. The following models were included in the merge (a rough sketch of such a merge follows this entry):
- nbeerbower/phi3.5-gutenberg-4B
- ArliAI/Phi-3.5-mini-3.8B-ArliAI-RPMax-v1.1
- bunnycore/Phi-3.5-Mini-Hyper
- bunnycore/Phi-3.5-Mini-Hyper + bunnycore/Phi-3.1-EvolKit-lora
- bunnycore/Phi-3.5-Mini-Sonet-RP
- bunnycore/Phi-3.5-mini-TitanFusion-0.1

Repository: localai · License: mit
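The phrasing above matches the auto-generated merge reports produced by mergekit, so a TIES merge of this kind would look roughly like the following. The density/weight values, the model subset, the paths, and the use of mergekit itself are assumptions, not the author's actual recipe.

```python
# Sketch of a TIES merge in the spirit of the entry above, driven through
# the mergekit CLI. Density/weight values and paths are assumptions.
import subprocess
import textwrap

config = textwrap.dedent("""\
    merge_method: ties
    base_model: microsoft/Phi-3.5-mini-instruct
    models:
      - model: nbeerbower/phi3.5-gutenberg-4B
        parameters: {density: 0.5, weight: 0.5}
      - model: bunnycore/Phi-3.5-Mini-Sonet-RP
        parameters: {density: 0.5, weight: 0.5}
    parameters:
      normalize: true
    dtype: bfloat16
""")

with open("ties-config.yaml", "w") as f:
    f.write(config)

# mergekit-yaml <config> <output-dir> writes the merged weights.
subprocess.run(["mergekit-yaml", "ties-config.yaml", "./merged-model"],
               check=True)
```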

phi-3-vision:vllm
Phi-3-vision is a lightweight, state-of-the-art open multimodal model built upon datasets that include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data in both text and vision. The model belongs to the Phi-3 model family, and the multimodal version supports a 128K-token context length. It underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization, to ensure precise instruction adherence and robust safety measures.

Repository: localai · License: mit

phi-3.5-vision:vllm
Phi-3.5-vision is a lightweight, state-of-the-art open multimodal model built upon datasets that include synthetic data and filtered publicly available websites, with a focus on very high-quality, reasoning-dense data in both text and vision. The model belongs to the Phi-3 model family, and the multimodal version supports a 128K-token context length. It underwent a rigorous enhancement process, incorporating both supervised fine-tuning and direct preference optimization, to ensure precise instruction adherence and robust safety measures.

Repository: localai · License: mit

phi-3.5-moe-instruct
Phi-3.5-MoE is a lightweight, state-of-the-art open model built upon the datasets used for Phi-3 (synthetic data and filtered publicly available documents) with a focus on very high-quality, reasoning-dense data. The model supports multiple languages and a 128K-token context length. It underwent a rigorous enhancement process, incorporating supervised fine-tuning, proximal policy optimization, and direct preference optimization, to ensure precise instruction adherence and robust safety measures.

Repository: localai · License: mit
