Large Language Models (LLMs)

Llama 3.1

Meta has released Llama 3.1, a collection of powerful open-source AI models that represents a significant leap in accessible AI technology. The launch comprises a new flagship 405B-parameter model alongside refreshed versions of the previously released 8B and 70B models.

Model Sizes + Architecture:

New Flagship Model:
Llama 3.1 405B (405 billion parameters)
- Refreshed versions of the existing 8B and 70B parameter models

Architecture:
Auto-regressive language model built on an optimised transformer architecture, incorporating Grouped-Query Attention (GQA) for improved inference scalability (see the sketch below).
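
A minimal, illustrative sketch of the idea behind grouped-query attention in PyTorch: several query heads share a single key/value head, shrinking the key/value cache at inference time. All shapes, head counts, and weights below are hypothetical placeholders, not Meta's actual implementation.

import torch
import torch.nn.functional as F

def grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads):
    # n_heads query heads share n_kv_heads key/value heads,
    # shrinking the KV cache by a factor of n_heads // n_kv_heads.
    bsz, seq_len, dim = x.shape
    head_dim = dim // n_heads
    group = n_heads // n_kv_heads  # query heads per shared KV head

    q = (x @ wq).view(bsz, seq_len, n_heads, head_dim)
    k = (x @ wk).view(bsz, seq_len, n_kv_heads, head_dim)
    v = (x @ wv).view(bsz, seq_len, n_kv_heads, head_dim)

    # Repeat each KV head so every query head in its group attends to the same keys/values.
    k = k.repeat_interleave(group, dim=2)
    v = v.repeat_interleave(group, dim=2)

    q, k, v = (t.transpose(1, 2) for t in (q, k, v))  # -> (bsz, heads, seq, head_dim)
    out = F.scaled_dot_product_attention(q, k, v, is_causal=True)  # causal mask = auto-regressive
    return out.transpose(1, 2).reshape(bsz, seq_len, dim)

# Toy usage: 8 query heads sharing 2 KV heads over a 16-token sequence.
dim, n_heads, n_kv_heads = 64, 8, 2
x = torch.randn(1, 16, dim)
wq = torch.randn(dim, dim)
wk = torch.randn(dim, dim * n_kv_heads // n_heads)
wv = torch.randn(dim, dim * n_kv_heads // n_heads)
print(grouped_query_attention(x, wq, wk, wv, n_heads, n_kv_heads).shape)  # torch.Size([1, 16, 64])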

Context Length:
128,000 tokens for all model sizes (a 16x increase over the 8K context window of Llama 3)

Training Data and Approach:

Pretraining Dataset:
Approximately 15 trillion tokens from publicly available sources
- Fine-tuning data: over 25M synthetically generated examples
- Fine-tuning methodology: supervised fine-tuning (SFT) followed by reinforcement learning from human feedback (RLHF); see the sketch below
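
A minimal supervised fine-tuning (SFT) sketch using the Hugging Face transformers Trainer. The toy example, hyperparameters, and training setup are placeholders standing in for Meta's actual recipe; gated access to the model weights on the Hub is assumed, and the RLHF stage is omitted.

from transformers import AutoModelForCausalLM, AutoTokenizer, Trainer, TrainingArguments
from datasets import Dataset

model_name = "meta-llama/Llama-3.1-8B"  # gated weights; access must be granted on the Hub
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForCausalLM.from_pretrained(model_name)

# A single toy instruction/response pair standing in for the ~25M synthetic examples.
examples = [{"text": "### Instruction: Summarise GQA.\n### Response: GQA shares key/value heads across groups of query heads."}]

def tokenize(example):
    out = tokenizer(example["text"], truncation=True, max_length=512)
    out["labels"] = out["input_ids"].copy()  # causal LM objective: predict the next token
    return out

dataset = Dataset.from_list(examples).map(tokenize, remove_columns=["text"])

trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="llama31-sft", per_device_train_batch_size=1, num_train_epochs=1),
    train_dataset=dataset,
)
trainer.train()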

Data Cut-off:
December 2023

Multilingual Support:

Supported Languages:
English, German, French, Italian, Portuguese, Spanish, Hindi and Thai
- Optimised for dialogue-based use cases (see the inference sketch below).
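
A quick dialogue-style inference sketch via the Hugging Face transformers chat pipeline. The model id, prompts, and generation settings are illustrative; gated access to the instruct weights is assumed.

from transformers import pipeline

chat = pipeline("text-generation", model="meta-llama/Llama-3.1-8B-Instruct")

messages = [
    {"role": "system", "content": "You are a concise multilingual assistant."},
    {"role": "user", "content": "Explique en deux phrases ce qu'est la longueur de contexte."},
]
reply = chat(messages, max_new_tokens=128)
print(reply[0]["generated_text"][-1]["content"])  # the assistant's turn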

Performance

How Does Llama 3.1 Stack Up?
Llama 3.1 Performance:



Explanatory Articles:

Fine-Tuning (SFT vs RLHF)
https://medium.com/@veer15/fine-tuning-vs-human-guidance-sft-and-rlhf-in-language-model-tuning-6ad54e1ba628
