The Yale Herald

A Digitized New Haven Police: Investigating Axon’s Draft One

Design by Madelyn Dawson

Interviews have been lightly edited for clarity.

The first image of a police officer that comes to mind is probably of one in action. Few imagine the cop spending hours behind their desk, writing report after report in high detail, filling out paperwork for constituents, or sending hundreds of emails. Yet at the New Haven Police Department (NHPD), these administrative tasks are more than routine. New Haven Police Chief Jacobson’s office seemed much larger than others at 1 Union Avenue, with full-size windows spanning the entire left wall and several paper-filled desks. Paper files and forms could be found in every corner. 

A typical police report summarizes the facts and events of a case. Reports typically include physical details of those involved—clothing, ethnicity, body language—shortened dialogue, and descriptions of crime scenes. While reports are not always admitted as court evidence, they remain essential to the legal process: they remind officers of the details of a case, reduce errors in recollection, and serve as resources for members of the public who want to look into the specifics of a case. Reports are also time-intensive: the specificity they demand, and the fact that no two cases are alike, mean that officers can spend hours each day behind the computer.

When I spoke to Chief Jacobson for the first time, he relayed how training newly hired officers to write such reports has changed. It became clear that he loved to share stories; especially when it came to his department’s more unique cases, Chief Jacobson had a relevant one for every topic. “I remember bringing [in] a new cop and saying, okay, on such-and-such a date at such-and-such a time, that’s what [a report] would start with. But I [was] just sitting there thinking, ‘Okay, what do I say from here? What do I say?’” he recalled. Chief Jacobson paused briefly, then added: “But AI’s just going to spit that out for you.”

When writing reports can take up approximately one-fourth of an officer’s workday, according to Chief Jacobson in the New Haven Independent, artificial intelligence seems like the best possible solution. With testing reported by the NHPD to begin by the end of this February, AI-generated reports promise efficiency. But as officers struggle to explain how the system works, the question remains: Can AI improve efficiency in policing—or will it raise transparency concerns?

***

During a City Hall meeting on November 19, 2024, Chief Jacobson reported to the Board of Alders Public Safety Committee that officers often spend at least two hours of their eight-hour shifts working on reports. His cover letter to the Board of Alders, a written proposal of what exactly the NHPD wants to purchase and why, suggested buying tasers, safety equipment, cameras, and a new program from the security company Axon: Draft One.

Draft One is a generative artificial intelligence program: a large language model (LLM) used as a summarizing tool. A general LLM is built in three steps—gathering data, annotating that data with meanings and labels, and verifying accuracy—over large collections of text and speech, allowing it to piece together new, sensible sentences. Using OpenAI’s technology, the program summarizes audio tracks from officers’ body cameras into a preliminary draft report. According to a recent statement from Axon, Draft One aims to “reduc[e] the administrative burden on officers, allowing them to be more effective in the field.” Police officers then “review and sign” the draft, editing the report to include visual details like race, sex, and body language. On December 16, 2024, the Board of Alders approved the $7,600,000 contract, with a plan to purchase all the new equipment and software over the next five years. In his letter to the Board of Alders, Chief Jacobson emphasized efficiency: his officers, he believed, do not spend enough time visibly patrolling the streets, which he calls “proactive policing.”

According to the NHPD’s routine end-of-year statistics, while New Haven’s overall “violent crime” rate—including homicides, rapes, and robberies—dropped in 2024, less serious crimes such as car thefts, burglaries, and assaults with firearms rose. The search for a technology that could reduce an officer’s time spent on paperwork thus became Chief Jacobson’s impetus for adopting Draft One. According to him, the program’s implementation will take one year, beginning sometime in February. During the first six months, 15 to 30 officers, three sergeants, and one lieutenant from the NHPD will use Draft One, and only on what the department considers “small-time” crimes. These cases—such as car thefts, bar fights, and petty shoplifting—are routine to the department, making generated reports easier to compare. Draft One will roll out to the rest of the department over the second six months to a year, expanding its usage to more serious, technical, and violent crimes.

Yet public language articulating Draft One’s mechanisms and usage is scarce. Axon cites no Draft One contracts in districts other than New Haven: just one case study, with the Lafayette Police Department, appears on the company’s website. Since December, the NHPD has not released major updates on Draft One’s implementation—no press release clarifying the February start date, and no posts on social media about the new technology since the City Hall meeting.

New Haven citizens have very few avenues to learn why the NHPD would want to implement Draft One or how it would use AI in such critical and sensitive situations. Opacity only exacerbates public distrust of the police: the lack of transparency makes it difficult to counter public fears about AI and to properly identify the NHPD’s desired outcomes for Draft One. As plans to train officers on generative AI drafts march forward, it is important for government institutions to explain the mechanisms and uses of an LLM like Draft One, and for specialized researchers to advise the NHPD on whether AI in policing is truly inevitable. Draft One’s implementation is preceded by decades of AI history, which the NHPD should elaborate on in its public outreach. To start, New Haven citizens and the NHPD must understand where AI comes from.

***

Generative AI became popular chiefly with the release of ChatGPT in 2022, but artificial intelligence has been well-integrated into common technologies for decades, from spell-checkers to video game development. In 1950, Alan Turing introduced The Imitation Game, a test of machine intelligence positing that if a human interrogator cannot differentiate between human and machine responses, then the machine has philosophically exhibited “intelligent” behavior. This benchmark remains influential as companies continue developing AI products that simulate human-like interactions.

Major tech firms like Apple, Google, Meta, OpenAI, and DeepSeek are racing to advance generative AI. Generative AI, the focus of the race right now, uses large datasets to predict what the user wants to see (the next word in a sentence, say, or the continuation of a photo) and then produce it. Companies often struggle to clearly explain how these models work—sometimes due to the complexity of the technology, other times because mystery and novelty generate profit.
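As a loose illustration of what “predicting the next word” means, the toy sketch below counts which word most often follows each word in a tiny training text, then uses those counts to guess a continuation. This is a deliberately simplified bigram model, not how Draft One or any modern LLM is built; real systems use neural networks trained on vastly larger datasets.

```python
from collections import Counter, defaultdict

# Toy sketch only: a bigram model illustrating next-word prediction.
# Modern LLMs replace these simple counts with neural networks.
def train_bigrams(corpus):
    """Count which word tends to follow each word in the training text."""
    counts = defaultdict(Counter)
    words = corpus.split()
    for prev, nxt in zip(words, words[1:]):
        counts[prev][nxt] += 1
    return counts

def predict_next(counts, word):
    """Return the most frequent continuation seen in training, if any."""
    if word not in counts:
        return None
    return counts[word].most_common(1)[0][0]

# Invented miniature "corpus" for demonstration.
model = train_bigrams("the officer wrote the report and the officer signed the form")
print(predict_next(model, "the"))  # the word most often following "the"
```

Scaled up by many orders of magnitude, this counting-and-guessing loop is the intuition behind the fluent text generative models produce.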

That was only a few years ago; now, the introduction of AI-generated police reports has sparked debate over the risks and benefits of automating such a critical function. For some, it may sound irresponsible, lazy, and unsafe to put something as important and potentially life-altering as police reports in the hands of AI. As an attempt to address those concerns, Draft One contains several “safeguards” to “maintain accuracy and transparency”: officer review and sign-off, similar to Google Docs’s “suggesting” feature; proofreading controls that require an officer to check information; an “unalterable digital audit trail,” which simply inserts a disclaimer at the start of the document that the draft was written with AI; and, for now, usage limited to small crimes.

Despite these safeguards, Axon is known for a lack of transparency and for ethical violations. Details about Draft One’s data sources and training methods remain vague. Axon has stated that Draft One was built on top of Microsoft’s Azure OpenAI Service and follows Microsoft’s Code of Conduct. OpenAI, formerly transparent about its algorithms’ statistics and training, stopped disclosing such information in 2022, reflecting a broader industry shift toward secrecy regarding labor and ethics. Axon also cited its internal Ethics and Equity Advisory Council as ensuring ethical production and usage of Draft One, but internal oversight is less rigorous than independent or government regulation. In 2022, the majority of Axon’s ethics board resigned over the CEO’s plan for taser-equipped drones, raising serious internal concerns about Axon’s values.

Furthermore, Axon clarified that it, Microsoft, and OpenAI are “strictly prohibited from sharing or using [agency customer] content for AI training.” But that simply means customer content will not be used to further train the system, which is standard practice. It does not address concerns about customer privacy or how Axon will store the sensitive crime data inherent to police reports.

According to Chief Jacobson, Draft One’s rollout is structured over five years: during the first of the five contract years, only three sergeants, a lieutenant, and the group of 15 to 30 officers will use the program, for testing purposes. Over the next six months to a year, Draft One will expand beyond “small-time” crime reports to the entire department, including arrest reports. Six months may seem like a short time: given that the experimental contract spans 63 months and costs millions, allocating roughly one-tenth of its duration for testing could be seen as rushed. However, Chief Jacobson emphasized that the city has a financial safeguard: if Draft One fails, the Board of Alders has been promised a refund of all or part of the cost, although the specifics of that agreement remain unclear.

Chief Jacobson, however, remains optimistic about the technology. To him, New Haven is a site of police innovation, and he cites historical firsts such as the early adoption of walking beats and community policing in the 1990s. He also views the Draft One software as a cost-effective investment, arguing that acquiring it early gets ahead of future price hikes. “I think through technology, the new cops that come out will learn how to write a police report just like everyone else, but they’ll also, in the academy, learn the right way to do AI too,” Jacobson said. “New Haven could be one of the pioneers for that.”

Chief Jacobson represents a larger movement of leaders who embrace AI as inevitable yet hold tools like Draft One to transparency and ethical standards. There is a real—but avoidable—risk that officers will not understand Draft One deeply enough, letting fabricated details and misinterpreted audio slip by. Axon has not been as forthcoming as the NHPD, and it has knowingly partnered with companies with troubled ethical records, particularly OpenAI. Conversely, having NHPD officers deeply understand AI mechanisms would allow them to pick up on subtle cues when the technology is wrong.

To better understand the mechanics of LLMs and digital security, I spoke to researchers, professors, and community members who have weighed in on the technology’s implications. Professor Vahid Behzadan, Director of the Secure and Assured Intelligent Learning Lab and Assistant Professor of Computer Science and Data Science at the University of New Haven, emphasizes the broader integration of AI into daily life. “You’ve seen that in colleges, many students are using LLMs, or some form of AI, to speed up or to get help with regards to writing assignments,” Professor Behzadan said. “Writing police reports, I believe, is no different.” He argues that AI-generated drafts still require significant human effort: officers must describe visual events, preserve syntax structure, and focus on the narrative, adding details that Draft One could not know, such as body language and race.

To Professor Behzadan, analyzing how language models function is the most critical aspect. AI predicts text by analyzing vast amounts of data, but verifying the accuracy of its output is just as important. “A speech-to-text model takes in the audio and…mapped them into sentences and words in the written form, perhaps a model for validation or verification of the transcription,” he said. “If certain words seem out of place, other models or extensions of the transcription model can be used to come up with possible corrections or possible variances of the original transcript to minimize the error in what’s produced for the end user. Once the transcript is ready, it’s just like any other piece of text.” However, this process appears at odds with what Chief Jacobson calls the system’s “gibberish” feature, which will leave unclear audio as unintelligible instead of attempting to parse it. All of this ultimately returns to the NHPD officer, who must acquire personal expertise in Draft One’s mechanisms and share that knowledge with community members. The question is no longer only one of accuracy, but of fairness to humans as well.
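Chief Jacobson’s description of the “gibberish” feature suggests a pipeline that refuses to guess at unclear audio. The sketch below is a hypothetical illustration of that idea, not Axon’s implementation: the words, confidence scores, and threshold are all invented, though real speech-to-text systems do expose per-word confidence scores like these.

```python
# Hypothetical sketch of a "gibberish" safeguard: rather than guessing at
# low-confidence audio, the pipeline marks it as unintelligible.
# The threshold is an assumed value, not Axon's actual setting.
GIBBERISH_THRESHOLD = 0.60

def render_transcript(scored_words, threshold=GIBBERISH_THRESHOLD):
    """Keep high-confidence words; replace low-confidence ones with a marker."""
    out = []
    for word, confidence in scored_words:
        out.append(word if confidence >= threshold else "[unintelligible]")
    return " ".join(out)

# Invented (word, confidence) pairs, standing in for a speech-to-text model's output.
scored = [("suspect", 0.97), ("fled", 0.92), ("xbrzt", 0.21), ("north", 0.88)]
print(render_transcript(scored))
# -> suspect fled [unintelligible] north
```

The design trade-off is exactly the one Behzadan and Jacobson describe from opposite sides: a correction model could guess at the garbled word, while the conservative marker leaves the gap for the officer to fill from memory.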

***

Assistant Professor of American Studies Julian Posada’s passion for the human workers behind artificial intelligence cannot be overstated. The giant shelf of books in the background of his Zoom call testifies to it: he turned to his bookshelf whenever recommending another text or scholar on AI research, naming colleague after colleague interested in the ethics of the human labor behind machine learning programs. Professor Posada shares concerns about the ethical implications of AI development. He highlights that the laborers in Venezuela and other developing nations who record sounds, copy texts, take photographs, and annotate all this data with definitions are usually overlooked. “Venezuela was targeted by many platforms at the time to find workers for this type of [AI development],” he said.

Professor Posada also warned of the emerging use of AI to create fake, unverifiable photos and speech, particularly in the wake of recent scandals involving AI-generated gore and pornographic material. “Deepfakes,” for instance, is a catch-all term for doctored voices and faces, and deepfakes have been deployed by politicians and activists alike.

While Professor Behzadan and Professor Posada do not know the specifics of Axon’s designs for Draft One, they underscore the need for transparency in its implementation. Professor Posada suggests that police officers need to become relative experts in how Draft One works to prevent ethical lapses. “The hope is that places like the New Haven Police Department and other organizations can know which companies are using ethical labor [and] ethically sourcing labor for their products, like you do for any other industry,” he said. In other words, the key is to treat generative AI like any other industry: any security breach or opacity will reflect on the NHPD, which is why constant, long-term verification is necessary.

Professor Behzadan also shared technical suggestions for any tech firm to keep in mind. “It is also possible to add a loopback process in the end with the same LLM or a different LLM for one or more rounds of verification through asking questions or prompts that would check to see whether the correct names are used, whether the events or the timeline of events [are] in accordance to what’s in the transcript and so on.” He notes that AI models have been known to “hallucinate” incorrect names or details, which, if unchecked, could lead to serious legal consequences.
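As a concrete, deterministic stand-in for one of the verification questions Behzadan describes, checking “whether the correct names are used,” the sketch below flags capitalized words that appear in a draft but nowhere in the source transcript. A real loopback system might instead prompt a second LLM with such questions; all of the names and sentences here are invented.

```python
import re

# Hypothetical sketch of one "loopback" verification step: flag names that
# appear in a generated draft but never occur in the source transcript.
def unsupported_names(draft, transcript):
    """Return capitalized words in the draft absent from the transcript."""
    names_in_draft = set(re.findall(r"\b[A-Z][a-z]+\b", draft))
    transcript_words = set(re.findall(r"\b[A-Z][a-z]+\b", transcript))
    return sorted(names_in_draft - transcript_words)

# Invented example: the draft "hallucinates" a witness named Thompson.
transcript = "Officer Rivera interviewed the witness, Daniels, at the scene."
draft = "Rivera interviewed Daniels. Thompson confirmed the account."
print(unsupported_names(draft, transcript))
# -> ['Thompson']
```

Even a crude check like this surfaces the kind of hallucinated detail that, if signed into an official report, could carry legal consequences.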

Chief Jacobson, however, argues that officers remain responsible for any inaccuracies, and that legal accountability will prevent them. “When the officer signs an authentic report, now it’s on [them]. It doesn’t matter if you used AI or if you wrote it in crayon. It’s on the responsible officer.” The expectation of accountability can motivate officers to thoroughly check their reports for accuracy and nuance; it prevents officers from scapegoating AI and rewards checking twice or three times over. Yet the tension between auditory and visual information, and the likelihood of biased datasets, must be supervised in the coming years. As with any machine, Draft One will not be free of error. It may seem that way when, after dozens of uses, an officer cannot spot any issues. But at some point Draft One can make a mistake—a misquoted word, or a phrase the machine deemed gibberish that could turn a case around—carrying severe legal consequences.

A researcher who spends extensive time with LLMs, like Professor Posada or Professor Behzadan, will continually engage with and critique the texts that AI writes; they know what to expect and how to improve a program. An officer who spends far less time generating reports, focusing instead on proactive policing, will not be as familiar with Draft One’s mechanisms, and cannot edit the software themselves. Judging by the vague language on its website and in its statements, Axon likely will not be satisfactorily responsive to officers’ calls for updates and improvements. In other words, officers might replace time spent writing reports with time spent haranguing Axon. Or they will uncritically accept whatever flawed technology they have. Strained communication between well-meaning users and experimental technology could lead to mass confusion or fatigue, but that can ideally be prevented through persistent communication with Axon.

Lastly, an often overlooked concern about summarization and transcription AI like Draft One is how accurately it can identify factors like race, nationality, and gender. As Professor Posada explains, AI training datasets can often be skewed to overrepresent the faces, voices, and accents of white males. “In 2018 or 2019, my colleagues—[director of the Distributed AI Research Institute] Timnit Gebru and [AI researcher] Joy Buolamwini—looked at facial recognition algorithms and data sets. They found that the accuracy is lower for people with darker skin and also those defined as female. In combination, black women or women with darker skin were at the bottom in terms of accuracy.” With LLMs, this bias can arise in accents, for instance. An LLM may not summarize speech in a Caribbean accent or African-American English as well as it would a Midwestern accent. This can also raise issues when Draft One only utilizes audio without properly incorporating visual cues such as body language, age, and emotion. Now, Posada notes, people are self-disclosing their identity to prevent coded bias. 
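The disparities Posada describes can only be caught by measuring accuracy separately for each group, as Gebru and Buolamwini’s audits did for facial recognition. The sketch below shows that kind of disaggregated audit in miniature; the group labels and word counts are invented, and a real audit would compute measured word error rates over labeled audio samples.

```python
from collections import defaultdict

# Illustrative sketch of a disaggregated accuracy audit for a transcription
# system. All records below are invented for demonstration.
def accuracy_by_group(records):
    """records: (group, correct_words, total_words) -> per-group accuracy."""
    totals = defaultdict(lambda: [0, 0])
    for group, correct, total in records:
        totals[group][0] += correct
        totals[group][1] += total
    return {g: round(c / t, 2) for g, (c, t) in totals.items()}

# Hypothetical audit data: transcription accuracy may differ by accent.
audit = [
    ("midwestern_accent", 96, 100),
    ("caribbean_accent", 81, 100),
    ("midwestern_accent", 47, 50),
    ("caribbean_accent", 40, 50),
]
print(accuracy_by_group(audit))
```

A single aggregate accuracy number would hide the gap between groups that this per-group breakdown makes visible, which is precisely the failure mode the researchers warn about.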

On the other hand, Chief Jacobson believes that the police officer reviewing and signing off on the transcript will contribute the visual facts. In fact, the officer will be made to fill in the blanks that Draft One could not infer, aiming to maintain sufficient human oversight.

Community activists have complained about the obscurity surrounding Draft One’s implementation timeline. In November 2024, the American Civil Liberties Union released a statement condemning the use of AI in report writing, stating that generative technology is not trustworthy, truthful, or reliable. The NHPD should remain in constant contact with local organizations and nonprofits that can effectively share information and resources about AI with the public. That is part of “proactive policing” as well: not just police officers visibly watching the streets, but taking the first steps to educate unknowing citizens. New Haveners have the right to demand that the NHPD enforce reciprocal oversight over a digital security company that has largely escaped the public eye. Locals have the right to demand that police officers develop expertise in AI before using Draft One. Locals have the right to know more.
