DeputyDev's 2025 and Beyond: Hurdles we are set to conquer
An unordered and dynamic list of challenges we're excited to solve for DeputyDev.
Posted by DeputyDev Team
5 minute read
As we prepare to say farewell to 2024, we took a moment to reflect on the past six months of building DeputyDev. It has been quite the roller coaster, filled with highs and lows, but every moment has been worthwhile. As we step into 2025, we have identified several key challenges we aim to tackle for DeputyDev.
Here is a list of the problems we want to solve:
#1 - Context is King
"Context is king" is a fundamental principle when working with Large Language Models (LLMs): the quality of the input directly shapes the quality of the output.
Finding the perfect balance between conciseness and comprehensiveness in context provision is crucial - it's not about providing all available information, but rather about providing the right information.
The art of context curation involves careful decision-making about what to exclude, which can be just as important as deciding what to include, as irrelevant information can dilute the model's focus.
While Claude 3.5 Sonnet, which DeputyDev uses for code reviews and generation, offers a substantial 200K context window, maximizing this capacity isn't always the optimal approach.
There's an observed phenomenon where extending context length beyond certain thresholds can lead to diminishing returns in terms of result quality, suggesting that more isn't always better.
DeputyDev has already achieved notable success in optimizing its context selection processes, demonstrating progress in this critical area.
The current focus at DeputyDev lies in further refinement of our context selection algorithm (RAGLoC), with the primary challenge being algorithmic optimization rather than model capabilities or infrastructure limitations.
#2 - Broadening Context Horizons
This relates to our point #1 and represents a utopian scenario we strive to achieve. Context is not confined to a single repo; it can span multiple repos and microservices. We want to capture all relevant context.
While current AI code generation tools excel at creating standalone, smaller-scale projects from scratch, they face significant challenges when confronted with the complexities of enterprise-level software development.
In organizational environments, even seemingly minor feature updates can trigger a cascade of necessary changes across multiple microservices, highlighting the interconnected nature of enterprise systems.
The challenge of identifying and providing optimal context becomes particularly daunting when dealing with organizational codebases, as relevant information may be distributed across numerous repositories and services.
Our goal is to develop solutions that can effectively understand and process context across an organization's entire repository ecosystem, moving beyond the limitations of single-repository context.
By enabling comprehensive cross-repository context understanding, we aim to enhance AI's ability to generate code that is both technically sound and perfectly aligned with the organization's existing architecture and requirements.
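The shape of the problem can be sketched with a toy cross-repo index: each snippet is tagged with its repository, so a query about one service can surface related code from its neighbours. The repo, path, and event names below are invented purely for illustration.

```python
from collections import defaultdict

class CrossRepoIndex:
    """Toy index spanning multiple repositories (illustrative only)."""

    def __init__(self):
        self._entries = []  # (repo, path, text) tuples

    def add(self, repo: str, path: str, text: str) -> None:
        self._entries.append((repo, path, text))

    def search(self, term: str) -> dict[str, list[str]]:
        """Return matching file paths grouped by repository."""
        hits = defaultdict(list)
        for repo, path, text in self._entries:
            if term.lower() in text.lower():
                hits[repo].append(path)
        return dict(hits)

index = CrossRepoIndex()
index.add("orders-service", "api/checkout.py",
          "def checkout(cart): emit OrderPlaced event")
index.add("billing-service", "handlers/order_placed.py",
          "on OrderPlaced: create invoice")
index.add("frontend", "src/theme.css", "colour palette")

# A change to the OrderPlaced event touches two services, not one.
hits = index.search("OrderPlaced")
```

Even in this toy form, the search shows why single-repo context falls short: the "minor" event change surfaces work in two services.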
#3 - Code Generation
DeputyDev's foundational vision extends beyond being a mere code review tool, aiming to serve as a comprehensive assistant that supports developers throughout their development journey.
Having successfully stabilized our code review module, we're now strategically positioned to advance into the next crucial phase - tackling the core aspects of software development and coding.
Our existing RAGLoC algorithm provides a strong foundation for this expansion, giving us the technological infrastructure needed to support more advanced development features.
The user experience design for such a comprehensive system presents an intriguing challenge, with multiple potential interfaces under consideration - from IDE integrations and plugins to CLI tools and web applications, or potentially a harmonious combination of all these approaches. While the exact form of the interface is still evolving, our unwavering focus remains on creating an intuitive and seamless experience that enhances developer productivity.
Our ambitious vision for code generation capabilities encompasses several key features: automated test case creation, cross-language code translation, codebase modernization to leverage newer language versions, and sophisticated real-time code completion with next-token prediction functionality.
#4 - Effortless Onboarding with Dashboard
DeputyDev is best appreciated through hands-on experience: developers truly recognize its value once it is integrated into their workflow.
We prioritize making the onboarding process as frictionless as possible, enabling developers to integrate DeputyDev into their development environment with minimal effort and just a few clicks.
Our approach emphasizes self-service exploration, allowing users to discover and understand the product's features independently without requiring direct support or assistance.
A comprehensive dashboard will serve as the central hub for both initial onboarding and ongoing performance monitoring, providing users with easy access to critical metrics and functionality.
To facilitate this hands-on exploration approach, we're planning to implement either a free tier or trial-based model, ensuring developers can experience the full potential of DeputyDev before making a commitment.
#5 - Vertical Integration
DeputyDev currently leverages advanced AI models like GPT-4 and Claude 3.5 Sonnet, through integrations with OpenAI and Anthropic, to drive its intelligence.
While these third-party models are highly capable, they present two significant challenges that impact our service delivery: data security concerns and inference speed limitations.
The security challenge stems from organizations' understandable hesitation about their code and sensitive data being transmitted outside their network boundaries, despite clear data retention and privacy policies from our AI partners.
Performance limitations, particularly in terms of inference speed, can make these models too slow for real-time applications like code completion, potentially impacting the user experience.
Our strategic vision involves achieving vertical integration through model ownership, either by utilizing an open-source base model like Llama 3 or developing a fine-tuned version tailored to our specific needs.
This approach would enable us to offer on-premise deployments, addressing data security concerns by keeping sensitive information within client networks, while also allowing us to optimize inference speed through infrastructure improvements and additional GPU memory allocation.
While we acknowledge that this initiative requires substantial investment in terms of both effort and resources, with ROI not immediately apparent, we believe that at scale, it will enable us to offer more competitive pricing and enhanced data security to our customers.
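A back-of-envelope calculation shows why the GPU memory planning mentioned above matters: model weights alone need roughly (parameter count) x (bytes per parameter), before KV-cache and activation overhead. The figures below are rough planning numbers, not benchmarks.

```python
def weight_memory_gb(num_params: float, bytes_per_param: int) -> float:
    """Memory needed just to hold the weights, in GB (1 GB = 1e9 bytes)."""
    return num_params * bytes_per_param / 1e9

# Llama 3 8B at different precisions (weights only, no KV-cache/activations):
fp16 = weight_memory_gb(8e9, 2)  # 16-bit floats: 2 bytes per parameter
int8 = weight_memory_gb(8e9, 1)  # 8-bit quantized: 1 byte per parameter
```

So an 8B-parameter model needs about 16 GB of GPU memory in fp16 just for weights, and roughly half that when quantized to int8; serving headroom for the KV-cache comes on top.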
#6 - The Platform Play
Instead of building point solutions for every workflow variant, we want to take a transformative step: platformizing DeputyDev. By exposing our core capabilities through robust APIs, we want to put the power of RAGLoC in our customers' hands. This means:
Build Your Way: Create custom tools that align perfectly with your team's workflow.
Integrate Anywhere: Embed DeputyDev's capabilities into your existing tools and processes.
Innovate Freely: Experiment with new applications of RAGLoC in areas we haven't even imagined.
Organizations are spending countless hours building and maintaining custom internal tools because existing solutions lack the flexibility to be customized for specific needs and integration points.
The absence of a truly flexible, API-first platform for AI-powered development is forcing companies to either compromise on their workflow or miss out on the benefits of AI-assisted development entirely.
This isn't just about building another developer tool - it's about fundamentally reimagining how AI-powered development capabilities are delivered and consumed. By solving this, we're not just saving time; we're unlocking a new era of customized, efficient, and intelligent software development.
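To make "API-first" concrete, here is a hypothetical sketch of what a request to a platformized DeputyDev might look like. The endpoint shape, field names, and `context_strategy` knob are all invented for illustration; no real API is called here, we only construct the payload a client might send.

```python
import json

def build_review_request(repo: str, pull_request: int, focus: list[str]) -> str:
    """Assemble a hypothetical review-request payload as a JSON string."""
    payload = {
        "repo": repo,
        "pull_request": pull_request,
        "focus": focus,                        # areas the caller cares about
        "context_strategy": "raglc-default",   # hypothetical tuning knob
    }
    return json.dumps(payload)

body = build_review_request("acme/checkout", 1234, ["security", "performance"])
```

The point of the sketch is the "Integrate Anywhere" promise: a payload like this could be fired from a CI job, a Slack bot, or an internal tool, rather than only from interfaces we ship ourselves.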
More ideas we are exploring:
Tool usage - Beyond code generation, DeputyDev aims to streamline the everyday tasks that developers perform, such as executing terminal commands, setting up files and folders, scaffolding project structures, and sourcing information from the web. These functionalities will serve as tools within DeputyDev, and we are exploring ways to intelligently integrate them to provide developers with a more seamless and efficient experience.
DeputyDev in terminal - GenAI's capabilities can be effectively utilized in the terminal too. Whether you're struggling to remember a specific bash command or need insights on resolving an issue using stack traces, our idea is that DeputyDev should be able to help you with almost everything within the terminal.
Evaluations - Evaluating LLM responses is crucial to ensure factual accuracy and prevent unintended performance declines. Although it's an active research area with various approaches, we're leaning towards using LLM-based evaluations—having one LLM assess another's responses. While it may seem like inception, our initial proofs of concept have yielded promising results.
We've tried gathering a list of the challenges we're currently tackling, but, as anyone whose day job is to think about a product's evolution knows, fresh ideas and priorities keep popping up. So, don't consider this our definitive roadmap. We simply hope it gives you a glimpse into what occupies our thoughts each day.
If you're reading this, you're probably intrigued by the challenges we're tackling. If you have any thoughts, concerns, or advice, feel free to send us an email at [email protected]