Unpacking Pfizer's Advanced Process Control Platform For Upstream Ops
A conversation with Shu Yang, Ph.D., and Edwin Shen, Ph.D. — Pfizer

Off-the-shelf process analytical technology (PAT) wasn’t cutting it, and their vendors couldn’t support what they were trying to achieve, so Pfizer scientists developed a fully bespoke bioreactor control platform for upstream process development.
Senior Scientist Shu Yang, Ph.D., and Principal Scientist Edwin Shen, Ph.D., have started deploying their custom-built advanced process control (APC) platform for real-time control in select areas of the company’s vast benchtop bioreactor ecosystem.
The tool gives modelers a heightened level of control and dexterity compared to default or vendor-provided controllers, thus speeding up process development. The duo offered to share insights from the project and answer questions. Here’s what they told us.
What upstream process pain points inspired this work? What couldn’t existing tools or controls handle?
Pfizer’s Upstream Process Development is pioneering the integration of machine learning (ML), PAT, and hybrid process models (AI + mechanistic) into real-time bioreactor control, known as APC. However, several pain points with existing tools and infrastructure necessitated a new approach:
- Customization limitations: Pfizer uses unique control schemes. Vendors are often unwilling to modify their systems to accommodate these custom needs.
- Incompatibility with ML and modeling tools: Traditional automation platforms like DeltaV and PLCs lack the mathematical libraries and flexibility needed for ML and model-driven APCs.
- Need for modularity: Pfizer operates hundreds of bench-scale bioreactors from various vendors. A modular system is essential for both high-throughput development and seamless scale-up to manufacturing.
- Agility and speed: Scientists and modelers frequently need to iterate on APC logic based on experimental feedback. Vendor systems and traditional automation teams cannot support the rapid turnaround required.
- Cost and efficiency: In-house development reduces costs and accelerates deployment compared to outsourcing.
- Digital bottlenecks: Integrating digital twins and advanced analytics into legacy systems is difficult due to outdated infrastructure and limited scientist bandwidth. Scientists often juggle modeling with experimental responsibilities, leaving little time for deployment efforts.
- ML infrastructure gaps: The current automation stack is outdated and disconnected from the data science tech stack, making ML deployment particularly challenging.
We have therefore created an in-house APC platform that exists as a distributed control layer on top of our traditional bioreactor controllers. This design enables us to circumvent the technical challenges of modifying vendor bioreactor controllers. Furthermore, we have significantly reduced the barrier to prototype new APCs, unlocking the full creativity and speed of our world-class scientists and modelers.
What led you to build this in-house rather than license or partner?
The decision to build in-house was driven by the need for:
- Full control and flexibility: Customize APCs to Pfizer’s unique processes without vendor constraints.
- Modular, scalable architecture: Develop a system that can adapt to diverse equipment and scale from lab to manufacturing.
- Empowerment of scientists and modelers: Provide tools that align with their workflows and allow them to focus on science, not infrastructure.
- Accelerate research and development: Enable scientists and modelers to validate APCs without a lengthy vendor customization process.
- Protect IP: Contain information of new R&D initiatives fully within Pfizer.
- Cost-effectiveness: Avoid the high costs and delays associated with vendor customization and outsourcing.
- Digital transformation alignment: Support Pfizer’s broader push toward AI, digital twins, and smart manufacturing.
Vendor solutions are often built around traditional factory automation models, which don’t align with the evolving needs of modern bioprocess development. As technology becomes more central to process optimization, we’ve made in-house automation a core capability.
We’re seeing a strong push-and-pull dynamic:
- Pull: Our process development teams are constantly seeking gains in productivity and consistency. Real-time analytics—like PAT, modeling, and APCs—are essential for precise control of cell culture.
- Push: The new generation of scientists brings strong modeling and programming skills. This has led to greater integration of modeling across teams.
Bottleneck: Integrating advanced analytics and digital twins with vendor control systems is difficult. For machine learning specifically, the gap is even wider. The challenge is twofold:
- Technology gap: Existing automation infrastructure isn’t designed to support modern digital tools, making integration costly and complex.
- People burden: Scientists still carry heavy responsibilities in experiment execution and process development, leaving little bandwidth to address the technology gap.
Our solution was to embed modern IT workflows into our team, strengthening our digital twin capabilities and empowering our modeling experts. We focused on two pillars: technology and people.
- Technology: We built a Python-native deployment layer on top of our distributed control system (DCS), enabling seamless model execution and communication with vendor systems. We use containerization and orchestration to ensure smooth tech transfer—from a scientist’s laptop to the automation system.
- People: We adopted DevOps/MLOps practices like version control and CI/CD. This allows scientists to collaborate on large projects and deploy changes without needing to interact directly with the control infrastructure.
What this enables:
- Flexibility: Any model or logic can be encoded and deployed.
- Scalability: The system has been connected to more than 150 bioreactors.
- Agility: A Python function on a laptop can be deployed to the DCS within two hours. The platform has empowered more than 20 scientists to integrate their models for real-time execution and control, and that number is growing.
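To make the "Python function on a laptop" workflow concrete, here is a minimal sketch of what such a deployable control function might look like. All names here (the registry, the decorator, the tag values) are hypothetical illustrations, not Pfizer's actual API; the idea is simply that a scientist writes a plain function and a deployment layer discovers and runs it.

```python
from dataclasses import dataclass
from typing import Callable, Dict

@dataclass
class ProcessState:
    """Snapshot of online measurements passed to an APC callback."""
    glucose_g_per_l: float
    viable_cell_density: float  # 1e6 cells/mL

# Hypothetical registry a deployment layer could use to discover APCs.
APC_REGISTRY: Dict[str, Callable[[ProcessState], float]] = {}

def apc(name: str):
    """Register a plain Python function as a deployable APC."""
    def decorator(fn):
        APC_REGISTRY[name] = fn
        return fn
    return decorator

@apc("glucose_feed_v1")
def glucose_feed(state: ProcessState) -> float:
    """Return a feed-pump setpoint (mL/h) from the current state."""
    target = 4.0  # g/L glucose setpoint (illustrative value)
    error = target - state.glucose_g_per_l
    return max(0.0, 10.0 + 25.0 * error)  # simple proportional action

# On each control cycle, the platform would invoke the registered function:
setpoint = APC_REGISTRY["glucose_feed_v1"](ProcessState(3.2, 11.0))
```

The point of the decorator pattern is that the scientist's file contains no infrastructure code at all; versioning the registry name (`glucose_feed_v1`) keeps old logic reproducible when a new iteration ships.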
What does your smart APC platform consist of? Is it homegrown software, layered on top of existing systems, or entirely custom-built?
The platform is homegrown and layered on top of existing systems, specifically DeltaV. Key components include:
- Communication layer: It uses OPC UA to interface with DeltaV and other vendor systems.
- Application layer: It executes APC logic written in Python, allowing for modular and flexible control strategies. This also includes hardware abstractions, where the same APC logic can control different types of bioreactors at different facilities.
- Containerization: Models and platform components are deployed as Docker containers orchestrated by Kubernetes, enabling portability, scalability, and fault-tolerance.
- Modular design: This separates hardware integration (via Kepware), platform infrastructure (Docker/Kubernetes), application logic (Python), and hardware abstractions (Python), allowing each layer to evolve independently.
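The hardware-abstraction idea in the application layer can be sketched as follows. This is an illustrative pattern under assumed names (the `Bioreactor` interface, the vendor class, and the OPC UA node strings are all hypothetical), showing how one piece of APC logic can run unchanged against different vendors' bioreactors:

```python
from abc import ABC, abstractmethod

class Bioreactor(ABC):
    """Hardware abstraction: one interface over vendor-specific controllers."""
    @abstractmethod
    def read(self, tag: str) -> float: ...
    @abstractmethod
    def write(self, tag: str, value: float) -> None: ...

class VendorABioreactor(Bioreactor):
    """Maps generic tag names onto one vendor's node layout.

    A real implementation would translate TAG_MAP entries into OPC UA
    reads/writes; here an in-memory dict stands in for the hardware.
    """
    TAG_MAP = {"pH": "ns=2;s=Unit1.pH.PV", "feed_rate": "ns=2;s=Unit1.Pump1.SP"}

    def __init__(self):
        self._values = {"pH": 6.9}  # stand-in for a live sensor read

    def read(self, tag):
        return self._values[tag]

    def write(self, tag, value):
        self._values[tag] = value

def ph_hold_apc(reactor: Bioreactor, target_ph: float = 7.0) -> float:
    """APC logic written once against the abstraction, reusable on any vendor."""
    error = target_ph - reactor.read("pH")
    correction = 0.5 * error  # illustrative proportional base-addition rate
    reactor.write("feed_rate", correction)
    return correction
```

Because `ph_hold_apc` depends only on the abstract interface, supporting a new bioreactor type means writing one adapter class, not touching any control logic.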
MLOps is still a novel concept in biomanufacturing. What does it look like in your setting? How do you handle model versioning, retraining, and performance monitoring in a GMP context?
Although the current implementation is non-GMP, the platform incorporates MLOps principles that align with GMP goals:
- Version control: All code and data are versioned, ensuring traceability and reproducibility.
- CI/CD pipelines: Automated integration and deployment workflows allow scientists to focus on logic rather than infrastructure.
- Pipeline architecture: Model development is structured as reproducible pipelines, though evaluation and retraining are still manual due to the need for SME input.
- Model registry: All models and Docker images are stored in a centralized registry with clear versioning.
- Infrastructure as code: The entire infrastructure is managed through code, enabling reproducibility and scalability.
These practices not only streamline development but also lay the groundwork for future GMP validation by enforcing traceability, reproducibility, and automation.
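The traceability goal behind the model registry can be illustrated with a minimal sketch. This is not Pfizer's registry; it is a generic stand-in showing the core idea that a content hash ties a deployed model back to the exact artifact and training-data snapshot it came from (the registry name, version, and data reference below are invented examples):

```python
import hashlib
import datetime

def register_model(registry: dict, name: str, version: str,
                   weights: bytes, training_data_ref: str) -> str:
    """Record a model artifact with a content hash for traceability.

    The SHA-256 of the serialized weights uniquely identifies the
    artifact, so a running container can always be traced back to the
    exact model version and training-data snapshot it was built from.
    """
    digest = hashlib.sha256(weights).hexdigest()
    registry[f"{name}:{version}"] = {
        "sha256": digest,
        "training_data": training_data_ref,
        "registered_at": datetime.datetime.now(
            datetime.timezone.utc).isoformat(),
    }
    return digest

registry = {}
digest = register_model(registry, "glucose-model", "1.2.0",
                        b"fake-weights", "data-snapshot-42")
```

In a GMP-oriented setting the same principle extends to the Docker image digest and the pinned dependency set, so a deployed control run is reproducible end to end.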
Walk us through how a process scientist (or modeler) interacts with the system. What do they see? What do they control? What does the AI handle?
Our system provides a streamlined experience for scientists to operate APCs and for modelers to deploy them. At the development scale, we support two target audiences: scientists and modelers.
Scientists who use APCs in their bioreactor experiments will interact with a user-friendly graphical interface. This user interface allows the scientist to:
- Seamlessly deploy APCs to any combination of bioreactors, pumps, and PAT in their development space
- Monitor trends and alarms of process variables in real time
- Adjust APC control parameters post-deployment
The APC handles second-to-second control of the bioreactor. The logic and AI/models behind the APCs are designed by the scientists and modelers.
Another key feature of our system is that we can quickly turn around feedback from our scientists. When developing new APCs, we observe that scientists commonly request modifications after one or two experiments. To unlock the full speed of their development efforts, we have designed a platform that simplifies new APC development.
Modelers: the system offers a modular architecture and CI/CD-enabled workflows that eliminate the typical IT overhead. They no longer need to worry about infrastructure, communication protocols, or deployment logistics. Instead, they can focus entirely on what they do best: building and refining models. With seamless integration into our automation system, a Python function developed on a laptop can be deployed to production within hours.
What types of real-time sensors (Raman, NIR spectroscopy, etc.) are you using, and how are you integrating that data in real time?
Our system is currently integrated with PAT. It is also integrated with the automated at-line sampling analyzer, the Bioprofile Flex 2. The system is future-proofed with its ability to integrate with any equipment that supports a communication interface.
Our equipment integration layer uses Kepware servers to streamline connectivity and convert legacy communication protocols to modern high-throughput OPC UA communications. Combined with our network-based platform layer, this method enables remote, distributed deployment of our system to Pfizer facilities all around the world.
Data is streamed into our application layer in real time, where APC logic is executed to calculate a low-latency, real-time response. For example, our PAT, analyzer, and more traditional online data are often processed through a model to drive dynamic nutrient and glucose feed rates. These dynamic processes have demonstrated promising results with improved titers, lactate control, and process robustness.
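One control cycle of the kind described above might be sketched as follows. The `opc_read`/`opc_write` callables stand in for OPC UA client calls, and the simple mass-balance model (feed-forward consumption term plus proportional correction) is an illustrative assumption, not Pfizer's actual model; tag names and coefficients are likewise invented for the example:

```python
def glucose_feed_tick(opc_read, opc_write,
                      target_glucose: float = 4.0,
                      specific_consumption: float = 0.2) -> float:
    """One control cycle: read measurements, compute feed rate, write setpoint.

    opc_read/opc_write are stand-ins for OPC UA client calls into the
    equipment integration layer. Units and coefficients are illustrative.
    """
    glucose = opc_read("PAT.Glucose.PV")   # g/L, e.g. a PAT-derived estimate
    vcd = opc_read("Analyzer.VCD.PV")      # 1e6 cells/mL, from at-line analyzer
    # Feed-forward term for predicted consumption plus proportional correction
    feed = specific_consumption * vcd + 5.0 * (target_glucose - glucose)
    feed = max(0.0, feed)                  # pumps cannot run backwards
    opc_write("Pump.Feed.SP", feed)
    return feed

# Simulated tag table for demonstration; a live system would hit OPC UA nodes.
tags = {"PAT.Glucose.PV": 3.5, "Analyzer.VCD.PV": 12.0}
rate = glucose_feed_tick(tags.__getitem__, tags.__setitem__)
```

Passing the read/write callables in as arguments keeps the control logic testable offline against simulated tags, which mirrors the laptop-to-DCS workflow described earlier in the interview.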
What stands in the way of deploying this in a commercial setting?
Deployment to commercial manufacturing will be a lengthy journey.
Initially, we will need to prove strong use cases for PAT, model, and ML-based APCs. Process improvements enabled by these APCs would need to be significant enough to justify the adoption of infrastructure and increased complexity. These efforts begin in the development labs where scientists would first need to adopt this digital mindset and tune APCs as a part of their scientific endeavors. Pfizer’s upstream process development has strong experience in this area with scientists who are accustomed to designing real-time control.
Our platform is currently progressing with the adoption and deployment process at several development sites across Pfizer. This paves the way for adoption in commercial, as these development facilities will be enabled to support process validation studies and satellite runs with analogous infrastructure. Pfizer-wide adoption of the platform also smooths the technology transfer process and reduces custom engineering efforts when APCs are translated to manufacturing scale.
Ultimately, using these APCs in the development of a specific molecule will drive adoption in the commercial setting. As we demonstrate the scientific effectiveness of APCs and the engineering effectiveness of our in-house platform, there will inevitably be an opportunity to use this technology in a portfolio project. Our platform will follow the molecule as it is tech transferred to different manufacturing facilities.
About The Experts:
Shu Yang is a manager at Pfizer. He conducts research in process modeling, artificial intelligence, process optimization and control systems. He received an M.Sc. from Rutgers University — New Brunswick and his Ph.D. from Rensselaer Polytechnic Institute.
Edwin Shen is a principal scientist at Pfizer. He conducts upstream process development research where he specializes in advanced control. He received his Ph.D. in biological engineering and small-scale technologies from the University of California, Merced.