A critical vulnerability (CVE-2025-64712) has been found in Unstructured.io, an ETL library that 87% of Fortune 1000 companies, including Amazon, Google, and Bank of America, rely on to process data for AI. The path traversal flaw, rated CVSS 9.8, allows arbitrary file writes and possibly remote code execution on systems that use the library. The root of the problem is insecure handling of Microsoft Outlook .msg attachments, which gives attackers the ability to overwrite important files such as SSH authorized_keys.
Unstructured.io converts unstructured data, such as PDFs, emails, and images, which makes up 80–90% of enterprise data, into AI-ready formats. Its open-source library on GitHub uses chunking, embedding, and extraction to process documents for vector databases.
[Image: Unstructured information (source: Cyera)]
Businesses use the library alongside managed SaaS APIs and a no-code platform that integrates with Salesforce, Google Drive, OneDrive, and S3. Well-known frameworks such as LlamaIndex and LangChain wrap it, which widens the blast radius across millions of deployments, including OpenWebUI.
CVE ID: CVE-2025-64712
CVSS Score: 9.8 (Critical)
Description: Path traversal in partition_msg() via .msg attachments. Attacker-controlled filenames (such as ../../root/.ssh/authorized_keys) enable arbitrary file writes, which can result in RCE through webshells, cron jobs, or startup script injection. The attack vector is the network, with low complexity and no privileges or user interaction needed.
Affected Versions: All versions before the most recent commit (check GitHub).
Patch Status: A patch is available; update right away.
The defect sits behind the partition_msg function, which hands the message to MsgPartitioner; its iter_message_elements method handles the email elements.
For attachments, AttachmentPartitioner.iter_elements stores them under /tmp/ by blindly concatenating the temp directory with self._attachment_file_name, the original, unvalidated filename from the .msg file. According to Cyera, an attacker can craft an attachment named ../../etc/passwd or similar and write malicious content anywhere on the filesystem that the process can write to. This can escalate to full server compromise, enabling data exfiltration, credential theft, or lateral movement.
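The unvalidated join described above can be illustrated with a minimal Python sketch. It is not the library's actual code or its patch: the function names unsafe_attachment_path and safe_attachment_path are hypothetical, and the safe variant shows just one common defense (keep only the final path component, then confirm the resolved path stays inside the temp directory).

```python
import os
import tempfile


def unsafe_attachment_path(tmp_dir: str, file_name: str) -> str:
    """Hypothetical stand-in for the flawed pattern: join with no validation."""
    # normpath collapses the "../" segments so the real destination is visible.
    return os.path.normpath(os.path.join(tmp_dir, file_name))


def safe_attachment_path(tmp_dir: str, file_name: str) -> str:
    """Hypothetical hardened variant: strip directories, then check containment."""
    # Treat backslashes as separators too, since .msg names often come from Windows.
    base_name = os.path.basename(file_name.replace("\\", "/"))
    if base_name in ("", ".", ".."):
        raise ValueError(f"unusable attachment name: {file_name!r}")
    root = os.path.realpath(tmp_dir)
    candidate = os.path.realpath(os.path.join(root, base_name))
    # Defense in depth: reject anything that still resolves outside the root
    # (for example through a symlink already present in the directory).
    if os.path.commonpath([root, candidate]) != root:
        raise ValueError(f"attachment escapes temp directory: {file_name!r}")
    return candidate


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp_dir:
        hostile = "../../root/.ssh/authorized_keys"
        print("unsafe:", unsafe_attachment_path(tmp_dir, hostile))
        print("safe:  ", safe_attachment_path(tmp_dir, hostile))
```

Run as a script, the unsafe variant prints a path outside the temporary directory, while the safe variant prints a file named authorized_keys inside it. Nothing is written to disk in either case; the sketch only computes paths.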
With over 4 million monthly downloads and nested dependencies in roughly 100K GitHub files via LangChain, this supply chain risk is hard to track. Cloud giants reference Unstructured in Azure, AWS, and GCP docs, embedding it in production AI pipelines. Organizations should audit their dependencies, upgrade the library from GitHub, and check for .msg inputs arriving from untrusted sources. CISA and vendors urge immediate patching to avert RCE in enterprise environments.
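As a starting point for the audit recommended above, the following Python sketch reports whether the library is installed in the current environment and lists .msg files under a directory. It rests on assumptions: the helper names installed_version and find_msg_files are hypothetical, the distribution is assumed to be published under the name "unstructured", and no fixed version number is encoded here, so the reported version still has to be compared against the project's advisory on GitHub.

```python
from importlib import metadata
from pathlib import Path
from typing import List, Optional


def installed_version(package: str = "unstructured") -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def find_msg_files(root: str) -> List[Path]:
    """List files under root whose suffix is .msg (case-insensitive)."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() == ".msg"
    )


if __name__ == "__main__":
    version = installed_version()
    if version is None:
        print("unstructured is not installed in this environment")
    else:
        print(f"unstructured {version} is installed; compare against the advisory")
    # Hypothetical ingestion directory; point this at wherever uploads land.
    for path in find_msg_files("."):
        print("found .msg input:", path)
```

This only inspects the active Python environment. Transitive installs in other virtual environments or container images need the same check run inside each of them.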