What If We Had Just One Developability Parameter?
A conversation with Leon Willis, Ph.D., and David Brockwell, Ph.D., University of Leeds

Among the dozens or even hundreds of manufacturability parameters to consider when developing new drugs, you’re probably looking at stability over time and propensity for aggregation. Drawing a clear picture of those critical quality attributes (CQAs) often takes a tremendous amount of time, material, and tests, which can provide conflicting results.
Researchers at the University of Leeds want to know whether developability assessments can be shaved down to a discrete number of assays or, better yet, a holistic developability parameter. In January, they published a paper in the journal Molecular Pharmaceutics titled “Rationalizing mAb Candidate Screening Using a Single Holistic Developability Parameter,” in which they claim to show that you can get reassuringly close by measuring fewer variables, which survey varied molecular properties.
Their work, fueled in part by an ongoing relationship with AstraZeneca, was led by Leon Willis, Ph.D., whose previous research into the effects of extensional flow on aggregation inspired this latest study. The team explored nine antibody formulations: a panel of three monoclonal antibodies, each in three buffer conditions.
They demonstrated that just five carefully selected assays — thermal stability, colloidal stability, molecular stickiness, and sensitivity to interfacial and hydrodynamic stresses — can predict storage stability at 25 degrees C with high accuracy. And they did it in less than a day using milligram quantities of material, compared to the typical six-month studies, which require hundreds of milligrams.
After promising results, they’re expanding the research to explore more antibodies. To better understand their work and the implications, we asked Willis and his supervisor and the project’s principal investigator, Professor David Brockwell, Ph.D., some questions.
This field typically relies on multiple independent assays, but you decided to pursue a single parameter. Why?
Willis: A biopharmaceutical is a complicated molecule, and there are a lot of different facets to those molecules that allow them to become successful as drugs.
There's what happens in the clinic that ultimately decides whether the drug is safe and efficacious. But in terms of actually manufacturing it at scale, a successful commercial therapeutic needs to possess desirable physical and chemical properties, and those varied properties need to be probed with different methods.
One method can't probe all of those different things. There's been a lot of effort to both develop individual methods and use an array of methods to understand the relationships between properties.
We got into this space having developed an extensional flow device, as discussed in our initial conversation, to see whether that was a probe of manufacturability. And through collaborating with Adimab, which screened 137 IgGs with 12 assays, we added our device as a 13th assay, and the statistics suggested it was a unique assay.
In our research, we were trying to address all of the above. For example, do the assay relationships change when you change a buffer component or different excipients, and do you need to do 2,000 tests, or is a minimum set good enough?
Through our approach, we were able to group tests into different families. When you look at which assays are in the families, they tend to probe certain things. To obtain a final holistic developability parameter, we used a statistical test to identify five assays from three of the four groups that could predict storage stability at 25 degrees C.
Rather than do the 20 assays we did in the paper, you could start with those five assays and then whittle the number of molecules you have to work with down — should that be what you wish to measure.
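To make the idea of assay "families" concrete, here is a minimal sketch of one way such grouping could be done: assays whose results rise and fall together across formulations are linked into the same family. The interview does not say which statistical method the Leeds team used, so the correlation threshold, the single-linkage rule, the assay names, and every number below are assumptions chosen for illustration, not values from the paper.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def group_assays(results, threshold=0.8):
    """Link any two assays with |r| >= threshold into the same family."""
    parent = {name: name for name in results}

    def find(name):
        # Follow parent links up to the representative of the family.
        while parent[name] != name:
            name = parent[name]
        return name

    for a, b in combinations(results, 2):
        if abs(pearson(results[a], results[b])) >= threshold:
            parent[find(a)] = find(b)

    families = {}
    for name in results:
        families.setdefault(find(name), []).append(name)
    return sorted(sorted(members) for members in families.values())


# Hypothetical readouts for nine formulations (three mAbs x three buffers).
# The assay names and values are invented for illustration only.
results = {
    "assay_a": [1, 2, 3, 4, 5, 6, 7, 8, 9],
    "assay_b": [3, 5, 7, 9, 11, 13, 15, 17, 19],   # tracks assay_a
    "assay_c": [1, -1, 1, -1, 0, -1, 1, -1, 1],
    "assay_d": [-1, 1, -1, 1, 0, 1, -1, 1, -1],    # mirror image of assay_c
}

families = group_assays(results)
for family in families:
    print(family)
```

With these made-up numbers, the two assays that track each other land in one family and the two mirror-image assays land in another. Taking the absolute value of the correlation is a design choice here: an assay that moves in exact opposition to another is still reporting on the same underlying property.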
Brockwell: Whenever we go to meetings, we have big discussions about what developability actually is. And the key issue is there's no pharmacopeial definition of what developability is and what standard assays have to be passed to be developable.
Using artificial intelligence approaches, lots of people are trying to predict different assays, but then the same problem arises. You can predict the outcomes of these developability assays without doing them. But then what do they actually mean?
We thought that we could simplify the question if we had a defined outcome, which is long-term stability at, say, 5 C or 25 C, which is very difficult to measure and takes a long time and a large sample. Essentially, long-term stability involves colloidal aggregation in its broadest sense, which could be due to partial degradation, chemical modification, or agglomeration of the unfolded, partially folded, or native states of the protein. These are all the different features we were measuring with these developability assays, so it seemed to be quite intuitive that the ability to predict long-term stability may be a mixture of the different assays that people measure.
How did you choose the three IgG1 antibodies for your research panel?
Willis: Basically, AstraZeneca had a lot of those molecules. Two of them had known developability issues, so those projects were stopped. For the third, they just had a lot of it. That helped because some of those assays we used were at the later stage of development.
For example, a viscosity analysis needed a sample at 131 mg per mL. In contrast, for the current project we're working on, which followed that research, we have 20 mg total. We were able to do a lot of tests because we had a lot of material, basically.
Given that the panel was in later development stages, how do you envision this method being integrated into selection workflows for less developed candidates?
Brockwell: That's where we think it would probably fit in really nicely. Essentially, we want to de-risk. We want to identify molecules that aren’t going to pass long-term stability. Say you have a lead panel of 20 molecules. We can express them at a tens-of-milligrams scale, run these five assays, and focus on just the ones that are going to cut the mustard.
You concluded with 25 C storage stability as your manufacturability parameter. Can you walk us through the steps it took to reach that metric?
Willis: Ideally, we'd have predicted 5 degrees, the gold standard. To set the shelf life for the drug, you've got to measure it in real time. That can be for two or three years. Obviously, if it goes wrong after two years, that’s a bit of a financial disaster for the drug company.
Lots of people have developed models to do experiments at high temperature and use the rates there to predict what happens at low temperatures, but that’s only if your antibody exhibits certain kinetic regimes. Not all of them do.
We thought that perhaps, based on our initial analysis with our flow device, for example, the rates of degradation and aggregation in that device would match what we saw in the fridge. The molecules just aggregate faster in the flow device by the same pathways as they do in a fridge. When we looked at the 5-degree data, the rates were very slow and were statistically insignificant, so we couldn’t use it as our Y axis in the final test.
Brockwell: But essentially, we think that because the IgGs provided by AstraZeneca were probably far down the development pipeline, all of them were relatively developable. And so that really goes back to the previous answer about saying, yes, we think this assay or family of assays might be reemployed in earlier development where we get a big difference between the good molecules and the bad molecules.
Willis: So, 25 C was the next best thing we could look at because it took six months of leaving things in an incubator and measuring them every so often.
Brockwell: And even though it's used extensively in industry — accelerated stability at 40 degrees — I think there's poor evidence of how well it correlates with long-term stability. You are getting closer to the unfolding pathway where things aggregate because they've unfolded — but at lower temperatures, proteins may aggregate because they’ve undergone these partial unfolding events exposing a little bit of hydrophobic core, which is usually hidden.
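The high-temperature extrapolation Willis describes can be illustrated with a short worked sketch. The interview does not name a specific model; the Arrhenius relationship used below is one common choice and is an assumption here, as are the rate values and their units. It also assumes a single activation energy holds across the whole temperature range, which is exactly the kind of kinetic behavior Willis notes not every antibody shows.

```python
from math import exp, log

R = 8.314  # gas constant, J/(mol*K)


def celsius_to_kelvin(t_c):
    return t_c + 273.15


def activation_energy(k1, t1_c, k2, t2_c):
    """Two-point Arrhenius estimate of Ea (J/mol) from rates at two temperatures."""
    t1, t2 = celsius_to_kelvin(t1_c), celsius_to_kelvin(t2_c)
    return R * log(k2 / k1) / (1.0 / t1 - 1.0 / t2)


def extrapolate_rate(k_ref, t_ref_c, ea, t_target_c):
    """Rate at the target temperature, assuming Ea stays constant."""
    t_ref, t_target = celsius_to_kelvin(t_ref_c), celsius_to_kelvin(t_target_c)
    return k_ref * exp(-ea / R * (1.0 / t_target - 1.0 / t_ref))


# Hypothetical aggregation rates (% monomer lost per month); not from the study.
k_25, k_40 = 0.10, 0.70

ea = activation_energy(k_25, 25.0, k_40, 40.0)
k_5 = extrapolate_rate(k_25, 25.0, ea, 5.0)

print(f"Estimated Ea: {ea / 1000:.1f} kJ/mol")
print(f"Extrapolated rate at 5 C: {k_5:.4f} % per month")
```

If the aggregation pathway changes between 40 C and 5 C, as Brockwell suggests it can, the single activation energy assumed here no longer applies and the extrapolated rate is unreliable.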
Your approach reduces the six-month, material-intensive stability study to a one-day assessment using milligram quantities. What barriers prevent industry adoption of this framework? Is this easily transferable to a commercial operation outside of the academic lab?
Willis: In theory, yes, because you’re doing a lot of the assays anyway. We're also aware that certainly some academic labs, and to my knowledge two biopharmaceutical companies, have developed their own extensional flow devices in the past couple of years.
On the other hand, we didn’t set any sort of thresholds for good or bad in the paper. You process the numbers, and you end up with a number at the end. Industry tends to set thresholds for good or bad molecules, often based on experience.
They'll say, for example, the melting temperature must be at least 70 degrees, and if it's any lower, it's bad. And there are these thresholds they've set based on the fact they've measured thousands of their own molecules. One company might have one number, whereas this may change for another company depending on what molecules they've measured.
Brockwell: One of the limitations of the study is that it was three molecules. We are trying to address that in a larger study we're doing now with 42 molecules. We've found a correlation for three mAbs in three buffers, but does that work for a wider array of sequences? Does it work for different formats, different buffers?
I think it probably needs a larger, more robust data set. But even if we had that, I think it would be very difficult to persuade lots of people to believe it without doing some pilot studies for themselves, but again, that’s difficult — these molecules are expensive.
What other CQAs are good candidates for this approach?
Willis: We argue, because this is a mathematical framework, it shouldn't matter what you put on that Y axis. It will either correlate or it won't, and it is just processing data, right?
Brockwell: You can calculate your experiments’ correlation with anything you like and ask which mixes of developability assays give you the best correlation. But I suppose then it leaves it up to the end user to decide which CQA to choose. I suppose the answer is: the one you find most difficult to measure. Then, you need to collect this reference once and then ask: “Which mixture of my developability assays would allow me to predict that?”
Willis: For example, viscosity is a nightmare to predict. Likewise, solubility limit, which is related, but it’s not exactly the same. Some people might wonder what the highest concentration mAb formulation theoretically could be, and they're aiming for something like a 200-plus mg per mL formulation. Is that even feasible routinely? Predicting something like that would be interesting.
Editor’s note: Leon Willis and David Brockwell wish to acknowledge project funding from U.K. Research and Innovation’s EPSRC program and AstraZeneca.
About The Experts:
Leon Willis, Ph.D., is a postdoctoral researcher at the University of Leeds’ Astbury Centre for Structural Molecular Biology in the School of Molecular and Cellular Biology. Since 2014, his work has focused on how hydrodynamic forces can damage proteins, specifically, monoclonal antibodies. He studied biological chemistry at the University of Sheffield and obtained his Ph.D. at Leeds.
David Brockwell, Ph.D., is a biochemistry and molecular biology professor at the University of Leeds’ Astbury Centre for Structural Molecular Biology in the School of Molecular and Cellular Biology. His research explores the effects that force has on proteins and membrane protein folding and folding factors with a focus on biopharmaceutical manufacturing. He obtained his B.Sc. and his Ph.D. at the University of Manchester.