When OpenAI unveiled its first open-weight models in years this August, it wasn’t just tech companies that were paying attention. The release also excited US military and defense contractors, which saw a chance to use them for highly secure operations.
Initial results suggest that OpenAI’s models lag behind competitors on the capabilities they need, some military vendors tell WIRED. But they are still pleased that models from a key industry leader are finally an option for them.
Lilt, an AI translation company, contracts with the US military to analyze foreign intelligence. Because the company’s software handles sensitive information, it must be installed on government servers and work without an internet connection, a practice known as air-gapping. Lilt previously developed its own AI models or used open-weight options such as Meta’s Llama and Google’s Gemma. But OpenAI’s tools were off the table because they were closed source and could only be accessed online.
The ChatGPT maker’s new open-weight models, gpt-oss-120b and gpt-oss-20b, changed that. Both can run locally, meaning users can install them on their own devices without needing a cloud connection. And with access to the models’ weights—the learned parameters that determine how a model responds to prompts—users can fine-tune them for specific tasks.
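How little that local setup involves is easiest to see in code. The sketch below loads the smaller model for offline inference, assuming the weights are published under the Hugging Face id openai/gpt-oss-20b and that the transformers library (and hardware with enough memory) is available; after the initial download, nothing here needs a network connection.

```python
# Minimal sketch of local, offline inference with an open-weight model.
# Assumes the Hugging Face id "openai/gpt-oss-20b" and an installed
# transformers library; weights are cached on local disk after download.
from transformers import pipeline

generator = pipeline(
    "text-generation",
    model="openai/gpt-oss-20b",  # weights live on local disk, not in a cloud API
    device_map="auto",           # place layers on whatever hardware is present
)

messages = [{"role": "user", "content": "Summarize this cable in English."}]
result = generator(messages, max_new_tokens=128)
print(result[0]["generated_text"])
```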
OpenAI’s return to open-weight releases could ultimately increase competition and lead to better-performing systems for militaries, health care companies, and others working with sensitive data. In a recent McKinsey survey of roughly 700 business leaders, more than 50 percent said their organizations use open source AI technologies. Models have different strengths depending on how they were trained, and organizations often use several together, including open-weight ones, to ensure reliability across a wide variety of situations.
Doug Matty, chief digital and AI officer for the so-called Department of War, the name the Trump administration is using for the Department of Defense, tells WIRED that the Pentagon plans to integrate generative AI into battlefield systems and back-office functions like auditing. Some of these applications will require models that are not tied to the cloud, he says. “Our capabilities must be adaptable and flexible,” Matty says.
OpenAI did not respond to requests for comment about how its open-weight models may be used by the defense industry. Last year, the company reversed a broad ban on its technology being used for military and warfare applications, a move that prompted criticism from activists concerned about harms caused by AI.
For OpenAI, offering a free and open model could have several benefits. The ease of access could cultivate a larger community of experts in its technologies. And because users don’t have to sign up as formal customers, they may be able to operate in secrecy, which could keep OpenAI from facing criticism over potentially controversial customers—like, say, the military.
Earlier this year, Matty’s unit at the Pentagon struck one-year deals worth up to $200 million each with OpenAI, Elon Musk’s xAI, Anthropic, and Google. The goal is to create prototypes of AI systems for different purposes, including automating war-fighting tools. Until OpenAI’s recent launch, Google was the only new tech partner that offered a cutting-edge open model as an option. The other companies license models that are run from the cloud and can’t be customized to the same extent as open models.
In Lilt’s case, CEO Spence Green says a military analyst may input a prompt like, “Translate these documents to English and ensure that there are no mistakes. Then have the most knowledgeable person about hypersonics check the work.” Lilt’s proprietary models, which are trained for government applications, handle the translation. Google’s Gemma currently handles routing, deciding which information goes to which models, analysts, and other teams. The aim is to address a shortage of language experts and a backlog of data to process.
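As an illustration of that routing step, the sketch below shows how a small open model can be prompted to classify each incoming document into a queue. The queue labels and the Gemma model id are hypothetical placeholders, not Lilt’s actual configuration.

```python
# Illustrative sketch: a small open model acting as a router that decides
# which queue an incoming document belongs to. Queue labels and model id
# are hypothetical, not Lilt's production setup.
from transformers import pipeline

ROUTES = ["translation", "subject-matter-review", "human-analyst"]

router = pipeline(
    "text-generation",
    model="google/gemma-2-2b-it",  # placeholder small open model
    device_map="auto",
)

def route(document: str) -> str:
    prompt = (
        "Assign this document to exactly one queue: "
        + ", ".join(ROUTES)
        + ".\nDocument:\n" + document + "\nQueue:"
    )
    reply = router(prompt, max_new_tokens=8, return_full_text=False)[0]["generated_text"]
    # If the model names no known queue, default to a human analyst.
    for label in ROUTES:
        if label in reply:
            return label
    return "human-analyst"
```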
OpenAI’s latest open-weight models aren’t well suited for Lilt’s needs. They process only text, and the military also needs to sort through images and audio. Lilt also found the models underperform in some languages and in situations with limited computing power. But the results haven’t discouraged Green. “With gpt-oss, there’s a lot of model competition right now,” Green says. “More options, the better.”
Other companies that work with the military say they got good results from the gpt-oss models, but they aren’t aware of any Pentagon projects using them that have moved past the demo stage. “It’s pretty early,” says Jordan Wiens, cofounder of Vector 35, which supplies reverse engineering tools to the Pentagon and has integrated gpt-oss into its offerings.
EdgeRunner AI, which is developing a virtual personal assistant for the military that doesn’t require a cloud connection, says it achieved sufficient performance with gpt-oss after fine-tuning it on a cache of military documents, according to a paper the company published in October. The US Army and the Air Force will begin testing the modified model this month, says Tyler Saltsman, EdgeRunner’s CEO.
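The underlying technique, adapting an open-weight model on a private document set, is standard supervised fine-tuning. Below is a minimal sketch using the Hugging Face peft and trl libraries; the dataset file and hyperparameters are placeholders, not EdgeRunner’s published recipe.

```python
# Minimal sketch: supervised fine-tuning of an open-weight model on a local
# document set using LoRA adapters. Dataset path and hyperparameters are
# placeholders, not EdgeRunner's actual configuration.
from datasets import load_dataset
from peft import LoraConfig
from trl import SFTConfig, SFTTrainer

# Records are expected as JSON lines with a "text" field.
dataset = load_dataset("json", data_files="military_docs.jsonl", split="train")

trainer = SFTTrainer(
    model="openai/gpt-oss-20b",  # base open-weight model
    train_dataset=dataset,
    peft_config=LoraConfig(r=16, lora_alpha=32, task_type="CAUSAL_LM"),
    args=SFTConfig(output_dir="gpt-oss-20b-tuned", max_steps=1000),
)
trainer.train()
trainer.save_model()  # writes the small adapter weights to output_dir
```

One appeal of this approach for air-gapped deployments is that only the small adapter weights, not a full copy of the base model, need to move between the training environment and operational systems.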
Open models may be particularly valuable in situations that require an immediate response or where internet connectivity could be disrupted. That includes AI systems running on drones or satellites, says Kyle Miller, a research analyst at Georgetown University’s Center for Security and Emerging Technology. Open source AI models offer the military “a degree of accessibility, control, customizability, and privacy that is simply not available with closed models,” he says.
Beyond direct deals with AI providers, the military also has access to about 125 open source models and about 25 closed options through an intermediary AI platform called Ask Sage, says Nicolas Chaillan, the company’s founder and a former chief software officer for the US Air Force and Space Force.
Chaillan says there are serious drawbacks to using open source models, particularly for the US military. They hallucinate and make incorrect predictions more often than the best commercial models, he claims. And while the models themselves are typically free, the infrastructure needed to run the biggest ones can cost as much as, or more than, licensing a commercial model over the cloud. “It’s like going from PhD level to a monkey,” Chaillan says. “If you spend more money and get a worse model, it makes no sense.”
He believes that the military should keep an eye on open models, but focus its efforts on using the more capable options that Microsoft, Amazon, and Google offer through cloud networks developed specifically for sensitive government tasks.
Other military suppliers and experts disagree, contending that closed models can create dependence on a single vendor and won’t meet the boutique needs of the armed forces.
Pete Warden, who runs the transcription and translation technology developer Moonshine, says his contacts in the defense world have become more cautious about trusting big tech companies after seeing how Musk used his Starlink satellite network to influence government leaders. “Independence from suppliers is key,” Warden says. His solution has been letting government agencies control a perpetual copy of Moonshine’s model in exchange for a one-time fee.
William Marcellino, who develops AI applications for the research group RAND, says open models that can be more easily controlled would help the military and spy agencies with projects such as translating materials for influence operations into regional dialects, a task that general commercial models may struggle to execute with precision. “It’s good to have choices,” he says.