Blockchain
Base (8453)
Token ID
30215930220488793788080665935884423659158641794762129232918117979548713265716
Description
# New Model: siliconmaid-7b
# Cognitive Core
## Package Name
siliconmaid-7b
## File
[Ansem](https://green-mad-elk-149.mypinata.cloud/ipfs/QmbaaDqnTAhHvTobozqCsmZ9sqtycG3sBk4fz6knUkiRoQ)
## How is this package better?
Credit to the model owner: **SanjiWatsuki.**
Silicon-Maid-7B is a model targeted at being both strong at RP and a smart cookie that can follow character cards very well. It demonstrates impressive creativity by creating an immersive chat experience for the user.
This model is a powerhouse built on two impressive 7B models:
- xDAN-AI/xDAN-L1-Chat-RL-v1, known for its unusually high score on MT-Bench, a benchmark that often reflects real-world performance.
- chargoddard/loyal-piano-m7, an Alpaca model that surprised everyone with its highly creative outputs.
It is interesting to see how well it performs in real-world tasks and roleplay.
Benchmark results:
**MT-Bench Average Turn**
| model | score | size |
| --- | --- | --- |
| gpt-4 | 8.99 | - |
| xDAN-L1-Chat-RL-v1 | 8.24^1 | 7b |
| Starling-7B | 8.09 | 7b |
| Claude-2 | 8.06 | - |
| Silicon-Maid | 7.96 | 7b |
| Loyal-Macaroni-Maid | 7.95 | 7b |
| gpt-3.5-turbo | 7.94 | 20b? |
| Claude-1 | 7.90 | - |
| OpenChat-3.5 | 7.81 | - |
| vicuna-33b-v1.3 | 7.12 | 33b |
| wizardlm-30b | 7.01 | 30b |
| Llama-2-70b-chat | 6.86 | 70b |
## How to run the model
### The source of merge
```yaml
models: # Top-Loyal-Bruins-Maid-DARE-7B
  - model: mistralai/Mistral-7B-v0.1
    # no parameters necessary for base model
  - model: xDAN-AI/xDAN-L1-Chat-RL-v1
    parameters:
      weight: 0.4
      density: 0.8
  - model: chargoddard/loyal-piano-m7
    parameters:
      weight: 0.3
      density: 0.8
  - model: Undi95/Toppy-M-7B
    parameters:
      weight: 0.2
      density: 0.4
  - model: NeverSleep/Noromaid-7b-v0.2
    parameters:
      weight: 0.2
      density: 0.4
  - model: athirdpath/NSFW_DPO_vmgb-7b
    parameters:
      weight: 0.2
      density: 0.4
merge_method: dare_ties
base_model: mistralai/Mistral-7B-v0.1
parameters:
  int8_mask: true
dtype: bfloat16
```
**Rationale behind the merge**
I went with DARE TIES because it appeared to be a viable way to combine information into models without losing smarts.
By picking a density of 0.8 for the first two models (xDAN-L1-Chat-RL-v1 and loyal-piano-m7), each parameter has a 96% chance (1 - 0.2 × 0.2) of keeping a delta from at least one of them in any TIES merge. This should ensure that there is a solid "base" of deltas from the base Mistral model that captures most of what makes these models good.
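One reading of the 96% figure above is the chance that, for a given parameter, at least one of the two density-0.8 models keeps its delta. The sketch below computes that under the assumption that each model drops deltas independently with probability `1 - density`; the helper name `chance_any_kept` is hypothetical and not part of any merge tool.

```python
def chance_any_kept(densities):
    """Probability that at least one model keeps its delta for a parameter,
    assuming each model drops deltas independently with probability 1 - density."""
    p_all_dropped = 1.0
    for density in densities:
        p_all_dropped *= 1.0 - density
    return 1.0 - p_all_dropped


# The two high-density models: xDAN-L1-Chat-RL-v1 and loyal-piano-m7.
high_density = [0.8, 0.8]
print(f"{chance_any_kept(high_density):.2%}")  # 96.00%
```

The same helper can be pointed at the three density-0.4 models to see how much sparser their combined contribution is.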
Next, there are 3 RP models merged in with medium density. Toppy-M-7B is an easy pick for being a well-regarded 7B RP model, although it is a merger of many mergers, which might dilute its effectiveness as a lower-density merge. NeverSleep/Noromaid-7b-v0.2 pulls in the unique private Noromaid RP dataset. Finally, athirdpath/NSFW_DPO_vmgb-7b is another Frankenstein OpenNeuralChat merger that happens to be DPOed on athirdpath's NSFW Alpaca pairs, which seemed like another good RP addition to the model.
By picking a density of 0.4, these models should *largely* impart some of their flavor onto the merger. I suspect the density could go even lower and the models could be used even more like a LoRA-like merger on top.
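To make the role of density more concrete, here is a toy illustration of the drop-and-rescale step that DARE is named for, applied at density 0.4. It is a simplified sketch on a plain Python list, not the merge tool's implementation; `dare_sparsify` and the toy `delta` vector are hypothetical names introduced for illustration.

```python
import random


def dare_sparsify(delta, density, rng):
    """Keep each delta with probability `density`, zero the rest, and rescale
    the survivors by 1/density so the expected value of each entry is unchanged."""
    return [x / density if rng.random() < density else 0.0 for x in delta]


rng = random.Random(0)       # fixed seed so the run is repeatable
delta = [0.01] * 10_000      # toy task vector: fine-tuned weights minus base weights
sparse = dare_sparsify(delta, density=0.4, rng=rng)

kept = sum(1 for x in sparse if x != 0.0)
print(kept / len(sparse))        # fraction kept, roughly 0.4
print(sum(sparse) / sum(delta))  # total magnitude preserved, roughly 1.0
```

Only about 40% of the entries survive, but the rescaling keeps the overall size of the contribution close to the original, which is why a low-density model can still add flavor.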
The DARE TIES merger is intentionally overweight and non-normalized at 1.3 total weight, to try to better capture the individual characteristics of the various models.
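The 1.3 figure is simply the sum of the `weight` values in the merge configuration. A quick check, with the normalized weights shown only for comparison (the merge described here does not use them):

```python
# Weights copied from the merge configuration; the base model has none.
weights = {
    "xDAN-AI/xDAN-L1-Chat-RL-v1": 0.4,
    "chargoddard/loyal-piano-m7": 0.3,
    "Undi95/Toppy-M-7B": 0.2,
    "NeverSleep/Noromaid-7b-v0.2": 0.2,
    "athirdpath/NSFW_DPO_vmgb-7b": 0.2,
}

total = sum(weights.values())
print(round(total, 2))  # 1.3

# For comparison only: what each weight would be if rescaled to sum to 1.
normalized = {name: w / total for name, w in weights.items()}
for name, w in normalized.items():
    print(f"{name}: {w:.3f}")
```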
**Prompt Template: Alpaca style prompt**
This model primarily uses Alpaca formatting, so for optimal model performance, use:
```text
Below is an instruction that describes a task. Write a response that appropriately completes the request.
### Instruction:
<prompt> (without the <>)
### Input:
<prompt> (if input exists)
### Response:
```
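A minimal sketch of assembling this template in code is shown below. The function name `build_alpaca_prompt` is hypothetical, and the exact line spacing is an assumption taken from the template as printed above; adjust it if your frontend expects blank lines between sections.

```python
ALPACA_HEADER = (
    "Below is an instruction that describes a task. "
    "Write a response that appropriately completes the request."
)


def build_alpaca_prompt(instruction, input_text=None):
    """Assemble an Alpaca-style prompt; the Input section is added only if given."""
    parts = [ALPACA_HEADER, f"### Instruction:\n{instruction}"]
    if input_text:
        parts.append(f"### Input:\n{input_text}")
    # The model's reply is expected to continue after the Response header.
    parts.append("### Response:\n")
    return "\n".join(parts)


print(build_alpaca_prompt("Greet the user in character.", "Character: a cheerful maid"))
```

The returned string can be passed to whatever inference backend you use; the model's completion follows the final `### Response:` line.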
## Future improvements
Future training aims to collect extensive datasets on John Wick comprising dialogues, character descriptions, and plot summaries extracted from the John Wick films.
**Limitations and Bias**
The model may not always fully understand the context or intent of your prompts, and its reasoning capabilities are still under development. Trained primarily on fictional stories, it excels at captivating narratives and innovative text formats, so use it for its creative prowess rather than for real-world information retrieval, and always double-check its outputs against reliable sources. Its training data may also reflect societal biases, so interpret its outputs critically.
---
*#proposer=0x1BAe63343831322A18345b2470cD620A982484e1*
*Commit=48d36912-cf08-4b10-8805-156c2dabb37d*