First off, I want to make sure you know: I really wrote this blog post without AI assistance!

I feel like that’s going to need to be a standard statement at the start of everyone’s future blog posts, sort of like political candidates saying they approved whatever message you’re about to hear (and probably with about the same amount of truth and value).

Anyway, that’s not why I’m here. I do want to talk about AI, though, which is hard for me because I’ve been deeply committed to being uninterested since the start of the current craze. However, much like those persistent college fundraising departments, the steady flow of news, hype, and fear that AI’s pursuers and pundits generate is almost impossible to escape.

While I’ve been passively paying attention to general developments and trends, I’ve recently become more actively interested in the technology and where it’s going, mostly because I know at this point it’s not going away.

I do believe these technologies will fundamentally affect the daily personal and work lives of people all over the planet. Many are quick to jump to the extreme positive and negative consequences. These conversations are super valuable, and I think even non-technologists should gain a healthy enough appreciation and understanding of these technologies to meaningfully navigate their impacts.

Even as a technologist with what I would consider a healthy understanding of current and coming capabilities, it’s really hard to keep up with the rate of change. While I can imagine all kinds of good (and bad) applications for these rapidly developing technologies, I’ve been trying to learn what they could do today to affect workflows and business processes that are familiar to me, such as building software prototypes or managing SaaS products.

Pitfall: LLM Edition

In summary, I think it’s going to be a while before building and maintaining legitimate software products is as simple as typing in a few sentences. While AI is really helping today’s designers, product managers, and engineers be more effective, a few missing pieces are preventing those roles from being replaced by full automation.

One giant limitation is the need for orchestration. For large-scale tasks to be completed, often many parallel and interrelated subtasks are necessary. Breaking down these large projects into smaller pieces is a fundamental part of every human expert’s job, relying on previous experiences to provide guidance on future work. We use tools, collaboration, and iteration to build plans that maximize the capabilities of everyone working on the team. We even have roles like product managers whose primary value is ensuring that everyone else has the information and support they need to keep work moving forward.

Large Language Models (LLMs), on the other hand, operate fundamentally differently from the traditional approach to project work. AI is exceptional at task completion. This makes sense, as most of these models are trained to accept input text and determine the most reasonable text to append to it. When you end your request to the LLM with a question or a code fragment, it’s going to give you a response, and much of the time that response will be reasonably correct.
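To make that concrete, here’s a minimal sketch of what task completion looks like in code, using the OpenAI Python client as one example. The model name and prompt are purely illustrative, not recommendations.

```python
# A minimal sketch of LLM task completion using the OpenAI Python client.
# The model name and prompt here are illustrative.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

response = client.chat.completions.create(
    model="gpt-4o-mini",  # any chat-completion model would do
    messages=[
        {"role": "user", "content": "Write a Python function that parses an ISO-8601 date."}
    ],
)

# The model returns the most reasonable text to append to the request.
print(response.choices[0].message.content)
```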

Task completion, though, is just part of the process required to build stuff. In a complex software development project, for example, an engineer is often directed to write some code that is both dependent on and depended upon by other parts of the project. Code that stores information relies on a data model, perhaps maintained by an architect, that must remain consistent across every coding task. Other engineers may use the code that is produced as the basis for their own tasks, avoiding repetitious and wasteful cycles. The effective coordination of all this work is essential for building software successfully.

LLMs can complete very complex and large tasks. However, because the model is a black box, the more that is asked of it, the less likely the work is to conform to expectations about how it gets done. This flaw is true of humans, too: the larger and more general the scope of what we need to do on our own, the more likely we are to do it our own way. The very idea of automation, however, implies a predictable result we can count on getting over and over.

This is where a healthy understanding of what the technology can do today and, more importantly, of how to fill in the gaps gets really interesting.

Cue the Orchestra…

Back to orchestration. If LLMs are great at completing tasks, and we want reliable and predictable output, it seems we need to find a way to build the human processes of decomposition, collaboration, and coordination into the technology. We need to make these processes work together and respond well when expectations aren’t met. To me, orchestration is the opportunity for innovation here.
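As a rough sketch of the shape I have in mind: decompose a big request, hand each piece to an LLM, check the output, and escalate when expectations aren’t met. Every helper here (decompose, complete_with_llm, meets_expectations, ask_human) is a hypothetical stand-in for the hard parts, not any real library’s API.

```python
# A rough sketch of orchestration: break a large request into small,
# checkable tasks, hand each to an LLM, verify the output, and fall
# back to a human when expectations aren't met. All helpers are
# hypothetical stand-ins.

def orchestrate(request: str, max_attempts: int = 3) -> list[str]:
    results = []
    for task in decompose(request):  # the decomposition humans do naturally
        for _ in range(max_attempts):
            output = complete_with_llm(task)  # LLMs excel at bounded task completion
            if meets_expectations(task, output):  # predictable, repeatable checks
                results.append(output)
                break
        else:
            # Retries exhausted: coordination means knowing when to escalate.
            results.append(ask_human(task))
    return results
```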

To be sure, descriptions of how these individual processes work are buried in the training data used to create these models. A chat LLM will tell you 1,000 different ways to build, design, and test software. Many researchers and other interested parties are working on ways to exploit this inherent knowledge to get even more complete work out of a given prompt or chat session.

In the meantime, though, I think coordinating many, many smaller units of more predictable output is a far better use of today’s technology, and it comes with some pretty great benefits. First and foremost, while humans are still involved, evaluating and coordinating smaller nuggets of work makes an otherwise opaque technology visible and comprehensible. Secondly, similar to human teams, work can be allocated more precisely to a trained “expert” whose experience and skills best suit the task.
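One way to picture that “expert” allocation: the same base model steered by a different system prompt per role. The roles, prompts, and complete_with_llm helper below are illustrative assumptions, not any real product’s API.

```python
# Allocating work to trained "experts": here, the same base model is
# steered by role-specific system prompts. Roles, prompts, and the
# complete_with_llm helper are illustrative assumptions.
EXPERTS = {
    "architect": "You maintain the data model. Keep it consistent across tasks.",
    "engineer": "You write small, well-tested functions that match the data model.",
    "reviewer": "You review code for defects and deviations from the plan.",
}

def route(role: str, task: str) -> str:
    # Each small task goes to the expert best suited to it.
    return complete_with_llm(task, system=EXPERTS[role])
```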

I’ve been very interested in some of the projects where researchers and teams are working on automating teams of agents: systems that take a simple request and attempt to distribute the work to independent LLM agents that can coordinate both with each other and with humans in the loop. Microsoft’s AutoGen, for example, shows promise for coordinating and automating larger work projects while addressing the limitations of LLMs. However, AutoGen does not appear ready today to tackle large-scale projects. In addition to the inherent challenges we’ve already discussed, projects like AutoGen need a way to deal with agents that run amok when tasks are too large or don’t have well-defined completion criteria.
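For a flavor of what this looks like, here’s roughly the two-agent pattern from AutoGen’s documentation as of the pyautogen 0.2 era. The project is evolving quickly, so treat the exact parameters as a moving target rather than a recipe.

```python
# Roughly AutoGen's documented two-agent pattern (pyautogen ~0.2):
# an assistant agent does the work, while a user proxy executes code
# and loops in a human.
from autogen import AssistantAgent, UserProxyAgent

assistant = AssistantAgent(
    "assistant",
    llm_config={"config_list": [{"model": "gpt-4"}]},  # API key supplied via the environment
)
user_proxy = UserProxyAgent(
    "user_proxy",
    human_input_mode="TERMINATE",   # ask the human before wrapping up
    max_consecutive_auto_reply=10,  # a guardrail against agents running amok
    code_execution_config={"work_dir": "scratch", "use_docker": False},
)

# The proxy relays the request, runs any code the assistant produces,
# and feeds results back until completion criteria (or the reply cap) hit.
user_proxy.initiate_chat(assistant, message="Write and test a function that dedupes a CSV file.")
```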

This is again where humans injected into the process can be very valuable. While the kinks are still being worked out of purely automated large-scale task completion, injecting proven AI capabilities at just the right moments in existing processes might be just the solution for now. I believe this is already happening within the silos of expert jobs: engineers have Copilot, designers have DALL-E, and everyone has SaaS tools enabled by a bevy of AI features. It seems natural to think that a framework or system for orchestrating this work, and for deciding when a task can be handled by automation and when it requires human feedback, would be an improvement.
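Even a crude version of that decision is easy to imagine: gate each task on some confidence estimate. The threshold and all three helpers (estimate_confidence, complete_with_llm, ask_human) below are hypothetical, sketched only to show the shape of the idea.

```python
# A crude sketch of the automation-vs-human decision: gate each task on
# a confidence estimate. The threshold and all helpers are hypothetical.
CONFIDENCE_THRESHOLD = 0.8  # tuned per workflow; purely illustrative

def do_task(task: str) -> str:
    confidence = estimate_confidence(task)  # e.g., task size, clarity, past success rate
    if confidence >= CONFIDENCE_THRESHOLD:
        return complete_with_llm(task)  # inject proven AI capability here
    return ask_human(task)  # otherwise the human expert stays in the loop
```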

As with most technology, managing expectations is essential. While it doesn’t seem to me like software product teams are on the verge of being replaced, they are certainly at risk of being left behind if they don’t work to understand the current state and future possibilities of all these incredible new technologies. Using these capabilities realistically to improve existing work seems like the best way to ensure a great role for human experts for a long time to come.