Project LLM Canary is an open source initiative started by UC Berkeley graduate students to address the security and privacy challenges within large language models (LLMs). The project intends to build a user-friendly tool that can evaluate and benchmark the security of fine-tuned LLMs, and aims to detect LLM vulnerabilities defined within the OWASP Top Ten for LLMs.
Under the LLM Canary project a small study was conducted to gain insights into user profiles, LLM usage, security concerns, and preferences related to security testing and benchmarking tools for LLMs. The survey included questions in multiple choice and short form questions related to LLM use and testing preferences.
There were 157 responses considered, any participants that had less than three months of experience were excluded from the results (three months or more was stated as a requirement to participate), as were any partially completed surveys. The final completed survey count consisted of 98 responses. Although the sample size is limited, the survey results provide the LLM Canary project with an indication of LLM practitioner preferences and usage patterns. The survey results are summarized in this report.
User Profiles
The majority of respondents said they were in a research division of their company followed by development operations and product development organizations. Product Managers made up the majority of participants at 31%, followed by 16% Data Engineers, 10% Product Engineers, and 10% LLM Developers. Not surprisingly, approximately 45% of respondents had worked with LLMs for 3-6 months and 30.6% for 6-12 months. The highest percentage of respondents worked on LLM integration (e.g. in applications, plug-ins), followed by security testing, fine-tuning foundation LLMs and prompt tuning.
LLMs and Working Environments
Integration was the top work focus (e.g. integrating LLMs and their applications) at 40%, followed by security testing, prompt engineering and then fine-tuning. While 25% of participants said their company worked with multiple LLMs (between 3-5), 21% said they worked with multiple LLMs directly (between 2-12).
The top 3 LLM providers that participants worked with were from OpenAI at almost 50%, Google at 20.6% and 9% said they worked with LLaMA from Meta. There was a fairly even split of cloud versus self-hosted versus a mix of public and privately hosted LLMs; 36.3% of participants said their LLMs were cloud hosted, 29.4% said their companies host the LLMs directly and 27% had a mix. A follow-on survey could include a question to specify LLM hosting environments for production versus development environments, which could surface indications of security and/or privacy concerns related to LLM hosting and preferences.
LLM Training Data, Testing and Integration
While 44% of participants said they used open source datasets to train their LLMs, it’s worth noting that 40% used proprietary company data and 24% said they used sensitive or Personally Identifiable Information (PII). LLM security development and/or testing would ideally be integrated into current workflows/pipelines as a testing harness fully developed inhouse by 26% of participants, 19% said they would prefer to use a pre-built test harness with custom developed tests, and 18% said they would prefer a standalone interactive GUI testing tool. When asked, 44% said they integrate their LLM(s) with third party service providers or plug-ins, and 34% said no.
Security and Privacy Concerns, and Reports
The top LLM vulnerability concern surfaced is sensitive information disclosure followed by insecure output handling, and then prompt injections. When asked who they would share the results of an LLM security vulnerability report with several audiences applied, most said their managers and security operations teams and developers, followed by product managers, and then executives.
Most people said a summary of the test results or risk score ranking by vulnerability type would be most useful to see in a report, and then a summary of the pass/fail test results and a comparison to an industry LLM benchmark score.
Additional Concerns
When asked what other concerns participants had, the vast majority of freeform responses were distributed between data protection and integrity of output. Many responses in the survey aligned with OWASP Top 10 for LLMs; participants highlighted the top-of-mind risks associated with ‘sensitive information leaks’, ‘data protection’, ‘security breaches’, ‘accidental confidential leaks’, ‘bias’, and ‘data leakage’.
Data protection, privacy and confidentiality surfaced as top concerns supported by the fact that 64% of the respondents train their LLMs with either proprietary corporate, or sensitive/PII data. Representative responses include, “Protecting the security of the training data used to train LLMs is critical, as data breaches can lead to the exposure of sensitive information”, “Data privacy and fine tuning challenges are the most concerning factors…”, and “Confidentiality of the data used and potential for prompt injection to expose information”. One respondent stated the problem at hand as, “LLMs are trained on vast amounts of data, and there's a concern about the privacy of individuals represented in the training data. Striking a balance between effective model training and respecting privacy is a challenge”.
Integrity also garnered strong representation as it was referenced in almost half of the freeform responses. Responses included concerns about insecure output handling, especially for when LLMs are integrated into a broader ecosystem and have more areas of compromise. Sample responses include, “Mostly the security risks that need to be addressed by developers and users, such as prompt injection, training data poisoning, model theft, sensitive information disclosure, and model denial of service”, and “Top concern is Denial of Service attacks bringing the LLM down”.
Participants are also concerned that “LLMs may be vulnerable to adversarial attacks, where input data is intentionally manipulated to mislead the model. Robustness against such attacks is a significant concern for applications where security is paramount.” Similarly, “protecting pre-trained models from adversarial attacks and unauthorized access is vital to prevent misuse or modification of the model's behavior.”
This leads to responses expressing areas of concern regarding the reliability of responses as “LLMs are complex models, and understanding how they arrive at specific decisions or generate certain outputs can be challenging.” These comments are especially strong where bias comes into play. Bias and related ethical challenges open a wider area of study that is only touched upon in this survey yet illustrate the interconnected nature of LLM behavior and the priorities of the developer community. Related commentary reinforcing this conclusion includes: “ensuring the model is unbiased and addresses ethical considerations to avoid perpetuating or amplifying existing biases,” and “bias can inadvertently be learned from training data and may lead to unfair or undesirable results.” Comments not necessarily security specific, still demonstrate the criticality of LLM security maturity across systems, data and governance matters as they address aggregate ethical and societal matters: “There is a concern about ensuring that LLMs are developed and used ethically, avoiding biases and discriminatory behavior in their outputs.”
Conclusions
This research was designed to gain insights into user profiles and preferences related to security benchmarking tools for LLMs. The survey validated key hypotheses of the LLM Canary project and supported a market need for an LLM security testing and/or benchmarking tool; organizations are fine-tuning LLMs with proprietary and sensitive data, and that most developers are still learning about LLMs, and how to safely train and integrate them into apps and services.
- 65% of respondents said they are customizing LLMs
- 64% said they used sensitive or proprietary data to train their LLMs
- 70% stated they have less than 1 year of experience working with LLMs
For testing tools, 26% said they would ideally want to develop a testing harness and tests inhouse and 19% said they would use a pre-built test harness with custom developed tests. The top perceived LLM vulnerabilities were included in the OWASP Top 10 for LLMs (the top 3 were Sensitive Information Disclosure, Insecure Output Handling, Prompt Injections), and the top most used LLM was OpenAI (74%), followed by Google and Meta, where 33% said they work with more than one LLM.
In terms of benchmarking, this is an area that warrants continuous study as the technology and market needs evolve. Benchmarking can play a role in accountability as the awareness and demand for higher safety and integrity of LLM enabled systems increases. Data protection is paramount, as is the manner in which the data is processed using AI, and how AI permeates our ecosystem. Smarter, multimodal testing, more sophisticated and dynamic benchmarking of LLMs, and ultimately guardrails and policies will need to be carefully crafted to prevent bias and harmful societal outcomes.
Authored by Jenn Yonemitsu and Rona Spiegel on behalf of the LLM Canary Project.