# Triangle

**Repository Path**: mirrors_microsoft/Triangle

## Basic Information

- **Project Name**: Triangle
- **Description**: No description available
- **Primary Language**: Unknown
- **License**: MIT
- **Default Branch**: main
- **Homepage**: None
- **GVP Project**: No

## Statistics

- **Stars**: 0
- **Forks**: 0
- **Created**: 2025-11-16
- **Last Updated**: 2025-12-15

## Categories & Tags

**Categories**: Uncategorized

**Tags**: None

## README

# Triangle: Empowering Incident Triage with Multi-Agents

Triangle is an end-to-end incident triage system using multiple LLM agents to route incidents to appropriate teams. It addresses challenges in cloud service incident management through semantic distillation and multi-role agent negotiation. Experiments show Triangle improves triage accuracy by over 20% while reducing response time. The system has been successfully deployed at a leading technology company serving millions of users.

## 🌟 Overview

This project implements an intelligent incident triage system using multiple Large Language Model (LLM) agents to analyze and route incidents to the most suitable teams. The system combines TF-IDF similarity matching with semantic analysis to achieve accurate incident assignment. By leveraging both rule-based and machine-learning components, Triangle ensures adaptability, scalability, and continuous improvement over time.

### Key Goals

1. **Efficiency**: Reduce incident response time and streamline communication across different teams.
2. **Accuracy**: Increase correct team assignment by understanding the semantic context of incoming incidents.
3. **Scalability**: Seamlessly integrate new capabilities and scale to handle an increasing number of incidents.
4. **Extensibility**: Allow new routing policies, data sources, and integration points to be added with minimal overhead.

## 🚀 Features

- **Multi-Agent Architecture**
  - **Triage Decider**: Makes final routing decisions.
  - **Team Manager**: Handles team information and negotiations.
  - **Analyzer**: Performs semantic analysis and TF-IDF matching.
- **Intelligent Matching**
  - TF-IDF based similarity scoring.
  - Semantic analysis of incident descriptions.
  - Multi-hop routing capability for complex incident redirections.
  - Team function phrase matching for increased accuracy.
- **Performance Tracking**
  - Real-time accuracy monitoring through dashboards.
  - Detailed logging of decisions for post-mortem analysis.
  - Result analysis and visualization for iterative improvements.
- **Confidence Estimation**
  - Confidence scores for each triage decision.
  - Threshold-based auto-assignment or manual review process.

## ⚙️ Architecture

The system consists of three main components, each performing specialized tasks to ensure consistency and efficiency:

1. **Triage Decider (TriageDecider)**
   - Gathers incident data from the Analyzer and Team Manager.
   - Matches incidents with relevant teams based on confidence scores.
   - Provides traceable reasoning for each routing decision.
2. **Team Manager (TeamManager)**
   - Maintains team capability profiles and summary key phrases.
   - Negotiates and escalates incidents when multiple teams are possible matches.
   - Ensures that team availability and load constraints are respected.
3. **Analyzer (Analyser)**
   - Performs TF-IDF analysis to derive initial similarity ranks.
   - Conducts semantic distillation of incident data.
   - Merges findings to produce final similarity and confidence metrics.
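
To make the TF-IDF ranking and the threshold-based assignment described above more concrete, here is a minimal sketch using only the Python standard library. It is not Triangle's actual implementation: the function names (`rank_teams`, `decide`), the smoothed IDF formula, the sample teams, and the `0.3` threshold are all assumptions made for illustration, and the LLM-based semantic analysis step is not modeled at all.

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase the text and split it into alphanumeric tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def tfidf_vectors(docs):
    """Return one sparse TF-IDF vector (a dict) per tokenized document."""
    n = len(docs)
    doc_freq = Counter(term for doc in docs for term in set(doc))
    # Smoothed IDF (an assumption; the README does not specify a formula).
    idf = {t: math.log((1 + n) / (1 + c)) + 1.0 for t, c in doc_freq.items()}
    vectors = []
    for doc in docs:
        counts = Counter(doc)
        vectors.append({t: (c / len(doc)) * idf[t] for t, c in counts.items()})
    return vectors


def cosine(a, b):
    """Cosine similarity between two sparse vectors."""
    dot = sum(w * b.get(t, 0.0) for t, w in a.items())
    norm_a = math.sqrt(sum(w * w for w in a.values()))
    norm_b = math.sqrt(sum(w * w for w in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def rank_teams(incident_message, teams):
    """Rank teams (person.json-style dicts) by similarity to the incident."""
    team_docs = [tokenize(" ".join(t["summary_key_phrases"])) for t in teams]
    vectors = tfidf_vectors(team_docs + [tokenize(incident_message)])
    incident_vec = vectors[-1]
    scored = [(t["name"], cosine(incident_vec, v)) for t, v in zip(teams, vectors)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


def decide(ranked, threshold=0.3):
    """Auto-assign the top team if its score clears the (hypothetical) threshold."""
    if not ranked:
        return {"team": None, "score": 0.0, "action": "manual_review"}
    name, score = ranked[0]
    action = "auto_assign" if score >= threshold else "manual_review"
    return {"team": name, "score": score, "action": action}


# Hypothetical teams in the person.json format.
teams = [
    {"name": "storage", "summary_key_phrases": ["disk quota", "blob storage latency"]},
    {"name": "networking", "summary_key_phrases": ["dns resolution", "load balancer timeout"]},
]
ranked = rank_teams("Customers report DNS resolution failures behind the load balancer", teams)
print(decide(ranked))
```

In this sketch the raw cosine score doubles as the confidence value. A real deployment would presumably merge it with the semantic signals the Analyzer produces before the Triage Decider applies any threshold.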
## ⏱ Performance Tracking and Metrics Triangle continuously monitors performance indicators to evaluate its effectiveness: | Metric | Description | |-------------------------|----------------------------------------------------------| | Accuracy | Percentage of correct team assignments | | Response Time | Average triage completion time | | Escalation Rate | Frequency of manual interventions or reassignments | | Confidence Score Mean | Average confidence for automated triage decisions | By measuring these metrics over time, Triangle helps identify improvements and ensure consistent, data-driven enhancements to incident triage workflows. ## πŸ“‹ Requirements - Python 3.8+ - Azure OpenAI API access - Required Python packages (see requirements.txt) ### Recommended Environment - A stable internet connection for reliable LLM access. - Sufficient resource allocation (CPU/Memory) for larger incident volumes. ## πŸ› οΈ Installation & Setup 1. **Clone the repository** ```bash git clone cd triangle ``` 2. **Setting Up the Virtual Environment** Follow these steps to create and activate a virtual environment, then install the required packages from `requirements.txt`. ### Prerequisites - **Python Installation**: Ensure Python is installed on your system. You can download it from the [official Python website](https://www.python.org/downloads/). - **Pip Verification**: Verify `pip` is installed by running `pip --version` in your terminal or command prompt. ### Steps 1. **Create a Virtual Environment** Open your terminal (Linux/Mac) or command prompt (Windows) and navigate to your project directory: ```bash cd /path/to/your/project ``` Create a virtual environment named `venv`: ```bash python -m venv venv ``` 2. **Activate the Virtual Environment** - **Windows**: ```bash .\venv\Scripts\activate ``` - **Linux/Mac**: ```bash source venv/bin/activate ``` After activation, your terminal prompt will change to indicate that the virtual environment is active. 3. 
**Install Required Packages** With the virtual environment activated, install the dependencies listed in `requirements.txt`: ```bash pip install -r requirements.txt ``` This command will read the `requirements.txt` file and install all necessary packages. 4. **Configure Azure OpenAI Credentials** Create a `config.json` file with your Azure OpenAI credentials: ```json { "ENDPOINT_URL": "your_azure_endpoint", "DEPLOYMENT_NAME": "your_deployment_name", "API_VERSION": "your_api_version", "API_KEY": "your_api_key" } ``` ## πŸ“Š Data Format ### Team Data (person.json) ```json [ { "name": "team_name", "summary_key_phrases": ["key_phrase1", "key_phrase2", ...] } ] ``` ### Incident Data (dataset.json) ```json [ { "case": "incident_id", "message": "incident_description", "last_person": "assigned_team" } ] ``` ### Advanced Topics for Data Management - **Data Versioning**: Use Git LFS or specialized tools to manage large datasets and historical changes. - **Privacy & Security**: Ensure that only sanitized or anonymized data is shared where needed, and follow your organization's data handling policies. ## 🎯 Usage Run the main triage system: ```bash python triangle.py ``` You can customize parameters in `triangle.py` to adjust agent behaviors, logging levels, or threshold settings for confidence scores. ## πŸ“ˆ Results When the triage process completes, the system generates detailed results in the `results` directory, including: - Assignment decisions for each incident - Confidence scores - Performance metrics - Routing paths Review these logs continuously to identify recurring issues and potential improvements in your triage logic. ## πŸ“– Contributing We welcome contributions from the community to make Triangle even better: 1. **Fork the Repo** and create your branch from `main`. 2. **Implement Features** or bug fixes in alignment with the project’s guidelines. 3. **Open a Pull Request**, detailing your changes, improvements, and testing for easy review. 
## ❓ FAQ | Question | Answer | |----------------------------------------|---------------------------------------------------------------------------------------------------| | How do I add a new team? | Add a new JSON object in person.json and include relevant key phrases that describe the team’s domain. | | How do I retrain or refine the model? | Update your training scripts using new incident data, then adjust the Analyzer module accordingly. | | Is on-prem deployment supported? | Yes, you can run Triangle self-hosted, but you need a stable internal environment for the LLM API. | ## πŸ”Ž Limitations & Future Work 1. **Language Coverage**: While the system supports English data, non-English data may require additional adjustments in the Analyzer. 2. **Contextual Knowledge**: Domain-specific knowledge bases can help enrich the semantic matching but are currently not fully integrated. 3. **LLM Dependence**: Triage decisions depend on the accuracy, availability, and cost of LLM services. 4. **Future Enhancements**: Plans include adding advanced multi-modal interfaces (voice, images) and incorporating user feedback loops for continuous learning.