Data Analysis

Cool Calculation

Marcin Chrząszcz, a physicist who does research into exotic elementary particles at CERN, organized a competition for data scientists. The best solutions were recently presented at the UZH Department of Physics.

Stefan Stöcklin

“Almost undetectable”: physicist and data analyst Marcin Chrząszcz works with extremely rare decays of subatomic particles. (Image: sts)

At first glance, particle physicist Marcin Chrząszcz – pronounced “Shratsh” – seems to fit many of the clichés you hear about physicists: he appears intelligent and introverted, and thinks very carefully before answering your questions. But as soon as the discussion moves to his field of research, decays of subatomic elementary particles, the calm scientist gets passionate.

His voice is strong as he explains the essence of his work, using his arms to emphasize what he says: “I work on decays of subatomic particles such as B mesons and tau leptons.” Unstable B mesons contain quarks and antiquarks, and are part of the standard model of particle physics. Also part of the standard model are leptons, which include stable particles such as electrons, and unstable tauons (tau leptons).

Real and hypothetical decays of this type of particle were the theme of a recent symposium on data mining at the UZH Department of Physics which Chrząszcz helped organize. Now he really gets into his swing: “My colleagues presented methods for the statistically significant detection of extremely rare decay events.”

LHC detectors

Marcin Chrząszcz is a postdoc in Ueli Straumann’s research group, and spends most of his time working at CERN near Geneva. The tool he uses for his research is the particle accelerator LHC (Large Hadron Collider), or more precisely the LHCb detector, one of several detectors that are part of the LHC.

Three years ago the particle accelerator hit the headlines by detecting the Higgs boson for the first time. While this was done in two huge house-sized detectors, CMS and ATLAS, researchers use the smaller LHCb to investigate less heavy particles that decay into quarks and antiquarks.

Measuring rare decays in the LHCb detector is all in a day’s work for the Polish scientist. He and his colleagues had the idea of making aspects of this field of research the subject of an international Kaggle competition for data analysis. In a Kaggle competition, the community of data scientists is posed an interesting question, and participants compete to present the best solution (see box on Kaggle competitions).

Extremely rare decay

“We put out hypothetical decays of B mesons and tau leptons for discussion,” explains Marcin Chrząszcz. In concrete terms, the challenge was to develop a program that could identify signals in real and hypothetical data.

“The idea was to detect an extremely rare decay that we don’t even know exists,” says the particle physicist. At 10⁻⁴⁰ (the figure 1 forty places to the right of the decimal point), the probability of these decays occurring is unimaginably small.

But something that sounds purely academic and utterly divorced from reality is actually a hot area of research, because it is basically about recognizing clear patterns in huge datasets, in other words filtering signals out of the noise. Many applications in our increasingly digital world are based on this ability of algorithms to trawl through data and interpret them correctly. Face recognition and self-driving cars are cases in point.

Hot topic

“The field of data analysis and pattern recognition is exploding,” says Marcin Chrząszcz. This explains the huge success of the Kaggle competition that he initiated: 673 teams from all over the world took on the scientific challenge. The winner was a team headed by Russian data scientist Alexander Guschin. He presented his solution at a recent workshop on data mining and the LHCb held at the UZH Department of Physics.

His proposal, something that non-physicists will find very hard to understand, garnered huge applause at the event in Zurich, reports Marcin Chrząszcz enthusiastically. The fact that most of the competitors came from eastern Europe and Russia, he says, is pure coincidence. And the fact that he himself is from Poland merely fits a widespread cliché about mathematically gifted eastern Europeans.

Competition to find the smartest solution

Kaggle is a public website run by data scientists in San Francisco. Questions are published on the portal, and participants are invited to compete to find the best solution. Experts – or laypeople – can describe their problem and upload the necessary data. The only condition is that the challenge be approved by a team of experts. The review costs USD 10,000, but after that there are no restrictions. The top three entrants receive a prize. Since Kaggle was launched in 2010 there have been many competitions to find answers to questions related to physics, medicine, astronomy, and culture. Some of the competitions running at the moment, for example, involve diagnosing heart disease using CT data, face recognition, and bank payments.

Stefan Stöcklin, Editor UZH News
