# RustGPT Python Implementation

This is a Python port of the RustGPT language model. It implements a simple transformer-based language model with pre-training and instruction-tuning capabilities.

## Features

- Transformer architecture with self-attention, feed-forward layers, and layer normalization
- Adam optimizer for training
- Token and positional embeddings
- Causal masking for autoregressive generation
- Pre-training on simple text-completion tasks
- Instruction tuning for conversational AI
- Interactive mode for testing the model

## Installation

1. Install the dependencies:

```bash
pip install -r requirements.txt
```

2. Run the model:

```bash
python main.py
```

## Architecture

The model consists of:

- **Embeddings**: Token and positional embeddings
- **TransformerBlock**: Self-attention + feed-forward with residual connections and layer normalization
- **OutputProjection**: Final linear layer that projects to the vocabulary size
- **LLM**: Main model class that orchestrates training and inference

## Training

The model is trained in two phases:

1. **Pre-training**: On simple factual statements, to learn basic language patterns
2. **Instruction tuning**: On conversational question-answer pairs, to learn to follow instructions

## Usage

After training, the model enters interactive mode, where you can ask questions and get responses. Type `exit` to quit.

## Documentation

For a comprehensive understanding of the project, refer to the following documentation:

### English Documentation

- [Project_Overview.md](Project_Overview.md) - Complete project overview with file structure and reading guide
- [Quick_Start_Guide.md](Quick_Start_Guide.md) - 5-minute quick start guide with usage examples
- [AI_Principles_Guide.md](AI_Principles_Guide.md) - In-depth explanation of AI and transformer principles

### Chinese Documentation

- [项目总览.md](项目总览.md) - Project overview, including file structure and reading guide
- [简易使用指南.md](简易使用指南.md) - 5-minute quick start guide
- [AI原理科普.md](AI原理科普.md) - In-depth explanation of AI principles
- [中文说明文档.md](中文说明文档.md) - Detailed documentation in Chinese

## Constants

- `MAX_SEQ_LEN`: 80 (maximum sequence length)
- `EMBEDDING_DIM`: 128 (embedding dimension)
- `HIDDEN_DIM`: 256 (hidden dimension in the feed-forward layers)
- `NUM_HEADS`: 4 (number of attention heads)
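
To make the Architecture and Constants sections concrete, the following is a minimal NumPy sketch of one transformer-block forward pass built around the constants above. It is an illustration only: the function names, the pre-norm residual layout, and the single-head attention shown here are assumptions and may not match the actual classes in `main.py`.

```python
# Minimal sketch of a transformer block forward pass (NumPy).
# Names and layout are illustrative; they may differ from main.py.
import numpy as np

MAX_SEQ_LEN = 80      # maximum sequence length
EMBEDDING_DIM = 128   # embedding dimension
HIDDEN_DIM = 256      # hidden dimension in the feed-forward layers
NUM_HEADS = 4         # number of attention heads (a single head is shown below)

def layer_norm(x, eps=1e-5):
    # Normalize each position's feature vector to zero mean / unit variance.
    mean = x.mean(axis=-1, keepdims=True)
    var = x.var(axis=-1, keepdims=True)
    return (x - mean) / np.sqrt(var + eps)

def softmax(x):
    x = x - x.max(axis=-1, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=-1, keepdims=True)

def causal_self_attention(x, w_q, w_k, w_v, w_o):
    # x: (seq_len, EMBEDDING_DIM); single attention head for clarity.
    seq_len, d = x.shape
    q, k, v = x @ w_q, x @ w_k, x @ w_v
    scores = q @ k.T / np.sqrt(d)
    # Causal mask: position i may only attend to positions <= i.
    mask = np.triu(np.ones((seq_len, seq_len), dtype=bool), k=1)
    scores = np.where(mask, -1e9, scores)
    return softmax(scores) @ v @ w_o

def feed_forward(x, w1, b1, w2, b2):
    # Two-layer MLP with ReLU: EMBEDDING_DIM -> HIDDEN_DIM -> EMBEDDING_DIM.
    return np.maximum(x @ w1 + b1, 0) @ w2 + b2

def transformer_block(x, params):
    # Residual connections around attention and feed-forward sublayers.
    x = x + causal_self_attention(layer_norm(x), *params["attn"])
    x = x + feed_forward(layer_norm(x), *params["ffn"])
    return x

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    d, h = EMBEDDING_DIM, HIDDEN_DIM
    params = {
        "attn": [rng.normal(0, 0.02, (d, d)) for _ in range(4)],
        "ffn": [rng.normal(0, 0.02, (d, h)), np.zeros(h),
                rng.normal(0, 0.02, (h, d)), np.zeros(d)],
    }
    tokens = rng.normal(0, 1, (10, d))  # 10 already-embedded tokens
    out = transformer_block(tokens, params)
    print(out.shape)  # (10, 128)
```

In the full model, the output of the final block would be passed through the output projection to produce logits over the vocabulary; the sketch stops at the block level to keep the example self-contained.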