# Information Processing

## It’s all in the clusters

Nicholas Wade in the Times on genetic clustering, differential selection and race. See here for previous discussion of a metric on the space of genomes and the scientific meaning of race.

I hear the voice of physicist turned evolutionary biologist Greg Cochran in Wade’s writing 🙂

Wade: Historians [and social scientists!] often assume that they need pay no attention to human evolution because the process ground to a halt in the distant past. That assumption is looking less and less secure in light of new findings based on decoding human DNA.

People have continued to evolve since leaving the ancestral homeland in northeastern Africa some 50,000 years ago, both through the random process known as genetic drift and through natural selection. The genome bears many fingerprints in places where natural selection has recently remolded the human clay, researchers have found, as people in the various continents adapted to new diseases, climates, diets and, perhaps, behavioral demands. …

Cochran: There is something new under the sun — us.

Thucydides said that human nature was unchanging and thus predictable — but he was probably wrong. If you consider natural selection operating in fast-changing human environments, such stasis is most unlikely. We know of a number of cases in which there has been rapid adaptive change in humans; for example, most of the malaria-defense mutations such as sickle cell are recent, just a few thousand years old. The lactase mutation that lets most adult Europeans digest ice cream is not much older.

There is no magic principle that restricts human evolutionary change to disease defenses and dietary adaptations: everything is up for grabs. Genes affecting personality, reproductive strategies, cognition, are all able to change significantly over few-millennia time scales if the environment favors such change — and this includes the new environments we have made for ourselves, things like new ways of making a living and new social structures. I would be astonished if the mix of personality types favored among hunter-gatherers is “exactly” the same as that favored among peasant farmers ruled by a Pharaoh. In fact they might be fairly different.

(Larger version here.)

From Dienekes:

A new article in BMC Genomics discusses the issue of predicting continental origin using randomly selected markers. The pdf is freely available.

One of the arguments of those who deny the existence of biological races is that their reality is subjective. Some extremists have argued that race is totally socially constructed; this is, however, disproven by the fact that socially constructed race is correlated with physical characteristics. Thus, rather than being separated from biology, the social phenomenon of race is rooted in biology.

A different argument holds that race is correlated with biology, but the differences are “skin-deep”, i.e., involve only superficial, visible, (and by some strange logic unimportant) characteristics. According to the proponents of this view, the idea of biological race places an undue emphasis on a set of traits: it is a result of the subjective choice of a set of traits as race-defining. Thus, the commonly recognized races of traditional physical anthropology are discounted as subjective organizations of the biological data: we could just as simply speak of a “lactose-intolerant race” according to this view.

In forensic science and admixture analysis scientists often discover and use polymorphisms which exhibit large inter-population differences. Decoding DNA isn’t free, thus, it makes sense to use the most informative, most “biased” markers when one is trying to discover the origin of a biological sample. For example, if Africans have 55% of gene version A and 45% of gene version B, and Europeans have 53% of A and 47% of B, it makes little sense to type this particular gene, since it cannot really tell us whether a sample is European or African. A gene where Africans have 90% of A while Europeans have 5% of A would be much more useful. Race skeptics claim, as with the physical anthropological data, that to privilege such carefully chosen genes is to stress the differences between groups; the implication is that in randomly chosen genes these differences are minor.
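One way to put a number on "informativeness" in the 55%/53% versus 90%/5% example above is the expected log-likelihood ratio per observed allele (the Kullback-Leibler divergence between the two frequency distributions). The sketch below is only an illustration: the function name `kl_bits` is hypothetical, the marker is assumed to have exactly two versions, and this is not claimed to be the measure used in the paper under discussion.

```python
import math

def kl_bits(p, q):
    """Expected log2-likelihood ratio per observed allele for a two-version marker.

    p: frequency of version A in the population the sample really comes from.
    q: frequency of version A in the alternative population.
    (Hypothetical helper; assumes 0 < p, q < 1.)
    """
    return p * math.log2(p / q) + (1 - p) * math.log2((1 - p) / (1 - q))

# Frequencies taken from the example in the text.
weak = kl_bits(0.55, 0.53)    # 55% vs 53% of version A
strong = kl_bits(0.90, 0.05)  # 90% vs 5% of version A

print(f"weak marker:   {weak:.4f} bits per allele")
print(f"strong marker: {strong:.4f} bits per allele")
```

Under these assumptions the first marker carries roughly 0.001 bits per observation and the second roughly 3.4 bits, which is why typing the first one on its own tells you almost nothing.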

The new paper is one of many (you can click on the Clusters label to find more) recent papers that have discovered that no matter what genetic markers you choose: SNPs, STRs, no matter how you choose them: randomly or based on their “informativeness”, it is relatively easy to classify DNA into the correct continental origin. Depending on the marker types (e.g., indel vs. microsatellite), and their informativeness (roughly the distribution differences between populations), one may require more or less markers to achieve a high degree of accuracy. But, the conclusion is the same: after a certain number of markers, you always succeed in classifying individuals according to continental origin.
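The claim that enough weakly informative markers eventually classify well can be illustrated with a toy simulation. Everything here is an assumption made for the sketch: markers are independent, each contributes one observed allele, every marker has the same made-up frequencies (55% versus 45% of version A), and the classifier is a plain summed log-likelihood ratio. It is not the method or the data of the BMC Genomics paper.

```python
import math
import random

def classify(alleles, p1, p2):
    """Return 1 or 2: the population under which the observed alleles are more likely."""
    llr = 0.0
    for is_a in alleles:
        # Add the log-likelihood ratio contributed by this marker.
        llr += math.log(p1 / p2) if is_a else math.log((1 - p1) / (1 - p2))
    return 1 if llr > 0 else 2

def accuracy(n_markers, p1=0.55, p2=0.45, n_people=200, seed=0):
    """Fraction of simulated individuals assigned to their true population."""
    rng = random.Random(seed)  # fixed seed so the run is repeatable
    correct = 0
    for i in range(n_people):
        true_pop = 1 if i % 2 == 0 else 2  # alternate between the two populations
        p = p1 if true_pop == 1 else p2
        alleles = [rng.random() < p for _ in range(n_markers)]
        correct += classify(alleles, p1, p2) == true_pop
    return correct / n_people

for n in (10, 100, 1000):
    print(n, "markers ->", accuracy(n))
```

With only ten such markers the toy classifier is not much better than a coin flip, while with a thousand it is right almost every time. Real markers are neither independent nor equally informative, so the number needed in practice depends on the marker type, as the text says.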

Thus, the emergent pattern of variation is not at all subjectively constructed: it does not deal specifically with visible traits (randomly chosen markers could influence any trait, or none at all), nor does it privilege markers exhibiting large population differences. The structuring of humanity into more or less disjoint groups is not a subjective choice: it emerges naturally from the genomic composition of humans, irrespective of how you study this composition. Rather than proving that race is skin-deep, non-existent, or unimportant, modern genetic science is both proving that it is in fact existent, but also sets the foundation for the study of its true importance, which is probably somewhere in between the indifference of the sociologists and the hyperbole of the racists.

Written by infoproc

June 27, 2007 at 2:59 am

Posted in evolution, genetics

## Rise of the money machines

As far as I could tell there was only one other physicist at foo camp, and he is CTO and head quant at a big hedge fund. He and Tim O’Reilly ran a discussion called Rise of the Money Machines 🙂

From the Economist’s review of the new Peter Bernstein book Capital Ideas Evolving (sequel to Capital Ideas, which was quite good).

Economist: …Indeed, Mr Bernstein seeks to show how financial giants such as Barclays Global Investors and Goldman Sachs Asset Management have built on the insights developed by the academics. If there are ways systematically to beat the markets these days, they probably require men with physics doctorates and massive computer power rather than a smooth manner and the right contact book.

There is the equivalent of a technological arms race as modern fund managers vie to find the best computer models and to trade quickly before their competitors spot the same opportunities. …

Written by infoproc

June 26, 2007 at 3:29 am

## Curved space and monsters

New paper!

http://arxiv.org/abs/0706.3239

A simple question: how many different black holes can there be with mass M? Conventional wisdom: of order exp(A), where A is the surface area of the hole and scales as M^2.
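For reference, the conventional counting above is the standard Bekenstein–Hawking area law. A sketch in Planck units ($G = \hbar = c = k_B = 1$), assuming a Schwarzschild (non-rotating, uncharged) hole, which the post does not specify:

```latex
\begin{align}
  r_s &= 2M, \\
  A &= 4\pi r_s^2 = 16\pi M^2, \\
  S_{\mathrm{BH}} &= \frac{A}{4} = 4\pi M^2, \\
  N_{\mathrm{microstates}} &\sim e^{S_{\mathrm{BH}}} = e^{A/4}.
\end{align}
```

The "of order exp(A)" above is the same statement with the numerical factor of 1/4 suppressed.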

Using curved space, we construct objects of ADM mass M with far more than exp(A) microstates. These objects have pathological properties, but, as far as we can tell, can be produced via quantum tunneling from ordinary (non-pathological) initial data. Our results suggest that the relation between black hole entropy and the number of microstates of the hole is more subtle than perhaps previously appreciated.

Update! Rafael Sorkin was kind enough to inform us of his earlier related work with Wald and Zhang. We’ve added the following end-note to the paper.

Note added: After this work was completed we were informed of related results obtained by Sorkin, Wald and Zhang [25]. Those authors investigated monster-like objects as well as local extrema of the entropy S subject to an energy constraint, which correspond to static configurations and obey $A^{3/4}$ scaling. For example, in the case of a photon gas the local extrema satisfy the Tolman–Oppenheimer–Volkoff equation of hydrostatic equilibrium. In considering monster configurations, Sorkin et al. show that requiring a configuration to be no closer than a thermal wavelength $\lambda \sim \rho^{-1/4}$ from its Schwarzschild radius imposes the bound $S < A^{3/4}$. […] $S > A^{3/4}$ are already black holes in the sense that the future of parts of the object does not include future null infinity.

Black hole entropy, curved space and monsters

Stephen D.H. Hsu, David Reeb

(Submitted on 21 Jun 2007)

We investigate the microscopic origin of black hole entropy, in particular the gap between the maximum entropy of ordinary matter and that of black holes. Using curved space, we construct configurations with entropy greater than their area in Planck units. These configurations have pathological properties and we refer to them as monsters. When monsters are excluded we recover the entropy bound on ordinary matter $S < A^{3/4}$. This bound implies that essentially all of the microstates of a semiclassical black hole are associated with the growth of a slightly smaller black hole which absorbs some additional energy. Our results suggest that the area entropy of black holes is the logarithm of the number of distinct ways in which one can form the black hole from ordinary matter and smaller black holes, but only after the exclusion of monster states.

## Foo camp impressions

Beautiful weather, but very cold at night.

Too much to absorb!

Larry Page came by helicopter. Jimmy Wales is in a tent like the rest of us. Dirac’s grandson is here. Data mining is big. Hedge funds are on the mind, even here. Lots of people working on Web 2.0 and collective intelligence. Facebook’s new platform is impressive. MySpace and Facebook in a technology race. Biobricks? Founders of Red Hat, Skype, Wikipedia, Amazon, MoveOn, Digg, Delicious, BitTorrent…

Report and video from WSJ reporter Kara Swisher about the event. I’m at the end of the video, explaining the game “werewolf” to Kara.

Sadly, I was not there to see Google’s Larry Page land a helicopter on the lawn up at Tim O’Reilly’s annual geekfest called Foo Camp, so I could mock him to his face.

(Note to Larry: That better be an awfully big solar footprint you’re building at the Googleplex in Silicon Valley to replace all the carbon emissions your various flying machines spew.)

Since I had to trade kid duty, I could only get up there Friday night for the opening festivities, which are held annually at the Sebastopol headquarters of O’Reilly Media.

Still, it was well worth the trek north of San Francisco to get a short glimpse of some new and sometimes quite wacky ideas about the future of digital development.

The conference is almost entirely user-generated, as people sign up to lead sessions on a variety of sometimes esoteric topics in rooms scattered all through the facility.

While most conferences look at the here and now, Foo Camp is aggressive in its quest to get people to think outside the box. In fact, if there were a box, the brainy denizens of Foo Camp would probably turn it into a time machine/beer dispenser/robot ninja warrior.

It could happen.

Many big wheels and many more big brains were there to figure it all out. Indeed, as you will see from this video, there is still a very pure and very infectious enthusiasm after many years at Foo Camp, even though some have complained about its ever-larger size.

So, for those who wanted to go and could not get in, here is a rather long glimpse at the first night of Foo Camp, including campers trying unsuccessfully to introduce themselves with only three words, a look at the tents, a talk with Tim O’Reilly and some in attendance explaining why they come:

Written by infoproc

June 24, 2007 at 3:51 pm

Posted in foo camp

## Mark to market

The near collapse of two Bear Stearns hedge funds investing in mortgage-backed securities is sending a tremor through Wall St. A last-minute bailout by creditor Merrill means that $800 million in CDOs is about to be marked to market. In other words, some complex, illiquid securities are about to have a meaningful price, as opposed to the theoretical value on the books. Insert words about Long Term Capital Management and “systemic risks” here.

“Nobody wants to look at the truth right now because the truth is pretty ugly,” Castillo said. “Where people are willing to bid and where people have them marked are two different places.”

I’ve written several posts about CDOs here. This and this are getting a lot of traffic right now. Word is that Gaussian copula models are **way** off from real market prices 🙂

June 21 (Bloomberg) -- Merrill Lynch & Co.’s threat to sell $800 million of mortgage securities seized from Bear Stearns Cos. hedge funds is sending shudders across Wall Street.

A sale would give banks, brokerages and investors the one thing they want to avoid: a real price on the bonds in the fund that could serve as a benchmark. The securities are known as collateralized debt obligations, which exceed $1 trillion and comprise the fastest-growing part of the bond market.

Because there is little trading in the securities, prices may not reflect the highest rate of mortgage delinquencies in 13 years. An auction that confirms concerns that CDOs are overvalued may spark a chain reaction of writedowns that causes billions of dollars in losses for everyone from hedge funds to pension funds to foreign banks. Bear Stearns, the second-biggest mortgage bond underwriter, also is the biggest broker to hedge funds.

“More than a Bear Stearns issue, it’s an industry issue,” said Brad Hintz, an analyst at Sanford C. Bernstein & Co. in New York. Hintz was chief financial officer of Lehman Brothers Holdings Inc., the largest mortgage underwriter, for three years before becoming an analyst in 2001.

“How many other hedge funds are holding similar, illiquid, esoteric securities? What are their true prices? What will happen if more blow up?”

Shares Fall

Shares of Bear Stearns, the fifth-biggest U.S. securities firm by market value, and Merrill, the third-largest, led a decline in financial company stocks yesterday, and the perceived risk of owning their bonds jumped on concerns losses related to subprime home loans may be bigger than initially thought. Both companies are based in New York.

The perceived risk of owning corporate bonds jumped to the highest in nine months today. Contracts based on $10 million of debt in the CDX North America Crossover Index rose as much as $10,000 in early trading today to $178,500, according to Deutsche Bank AG. They retraced to $171,500 at 8:28 a.m. in New York.

U.S. Securities and Exchange Commission Chairman Christopher Cox said yesterday that the agency’s division of market regulation is tracking the turmoil at the Bear Stearns fund. “Our concerns are with any potential systemic fallout,” Cox said in an interview.

Bankers and money managers bundle securities into a CDO, dividing it into pieces with credit ratings as high as AAA. The riskiest parts have no rating because they are first in line for any losses. Investors in this so-called equity portion expect to generate returns of more than 10 percent.

Fivefold Increase

CDOs were created in 1987 by bankers at now-defunct Drexel Burnham Lambert Inc., the home of one-time junk-bond king Michael Milken. Sales reached $503 billion in 2006, a fivefold increase in three years. More than half of those issued last year contained mortgages made to people with poor credit, little loan history, or high debt, according to Moody’s Investors Service.
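The tranche structure described above (pieces rated as high as AAA, with an unrated equity piece first in line for losses) can be sketched as a simple loss waterfall. The tranche names and sizes below are made up for illustration, the function `allocate_losses` is hypothetical, and real CDO waterfalls have many more rules (interest diversion, coverage tests and so on) than this sketch captures.

```python
def allocate_losses(pool_loss, tranches):
    """Assign pool losses to tranches, most junior first.

    tranches: list of (name, size) pairs ordered from junior to senior.
    Returns a dict mapping tranche name to the loss it absorbs.
    """
    remaining = pool_loss
    losses = {}
    for name, size in tranches:
        hit = min(size, remaining)  # a tranche cannot lose more than its size
        losses[name] = hit
        remaining -= hit
    return losses

# Hypothetical $100 million pool, sizes in $ millions (illustrative only).
tranches = [
    ("equity (unrated)", 5.0),
    ("mezzanine", 15.0),
    ("senior (AAA)", 80.0),
]

# An 8% loss on the pool wipes out the equity and dents the mezzanine piece.
print(allocate_losses(8.0, tranches))
```

In this toy structure the senior piece is untouched until pool losses exceed 20%, which is the sense in which a pool of risky loans can still produce a highly rated tranche, and also why the valuation of the junior pieces is so sensitive to assumptions about losses.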

New York-based Cohen & Co. was the biggest issuer of CDOs last year. It has formed 36 CDOs since 2001, including 15 worth a total of $14 billion in 2006, according to newsletter Asset-Backed Alert.

Not since 1994 have mortgages with past due payments been so high, according to first-quarter data compiled by the Federal Deposit Insurance Corp., the agency that insures deposits at 8,650 U.S. banks. Lehman analysts estimated in April that the collateral backing CDOs had fallen by $25 billion.

“The big question is whether these forced liquidations represent a tipping point in the market,” said Carl Bell, who helps manage $63 billion in fixed-income assets as head of the structured-credit team at Boston-based Putnam Investments. It “may put pressure on other hedge funds pursuing similar strategies” as the Bear Stearns funds, he said.

Biggest Names

The Bear Stearns funds are run by senior managing director Ralph Cioffi. One of the funds, the 10-month old High-Grade Structured Credit Strategies Enhanced Leverage Fund, lost 20 percent this year, the New York Post reported. Officials at Bear Stearns and Merrill declined to disclose the losses.

The funds had borrowed at least $6 billion from the biggest names on Wall Street. Aside from Merrill, other creditors included Goldman Sachs Group Inc., Citigroup Inc., JPMorgan Chase & Co. and Bank of America Corp. All of the firms are based in New York, except Bank of America, which is based in Charlotte, North Carolina.

As the funds faltered, Merrill sought to protect itself by seizing the assets that were used as collateral for its loans. JPMorgan planned to sell assets linked to its credit lines before reaching agreement with Bear Stearns to unwind the loan, people with knowledge of the negotiations said yesterday.

Bear Stearns was still in talks late yesterday with creditors to the funds to rescue the funds, said the people, who declined to be identified because the negotiations are private.

Russell Sherman, a Bear Stearns spokesman, and Jessica Oppenheim, a spokeswoman for Merrill, declined to comment.

‘Pretty Ugly’

Merrill’s decision yesterday to accept bids on $800 million of bonds it took as collateral for its loans further stifled trading in CDO securities, said David Castillo, who trades asset-backed, commercial-mortgage and CDO bonds in San Francisco at Further Lane Securities.

“Nobody wants to look at the truth right now because the truth is pretty ugly,” Castillo said. “Where people are willing to bid and where people have them marked are two different places.”

The perceived risk of holding Bear Stearns bonds jumped to a three-month high, according to traders betting on the creditworthiness of companies in the credit-default swaps market. Contracts based on $10 million of its bonds rose $5,800 to $45,500, according to composite prices from London-based CMA Datavision. An increase in the five-year contracts suggests deterioration in the perception of credit quality. Contracts on Merrill jumped $4,700 to $33,000, CMA prices show.
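The dollar figures quoted for credit-default swaps can be converted to the more usual basis-point spread. The sketch below assumes the quoted amount is the annual cost of protecting $10 million of bonds, which is the common convention for such quotes but is not stated explicitly in the article. The helper name `cds_spread_bps` is hypothetical.

```python
def cds_spread_bps(annual_cost, notional):
    """Convert an annual protection cost into a spread in basis points.

    One basis point is 0.01% of the notional, i.e. notional / 10,000.
    Assumes annual_cost is a yearly premium (an assumption, see above).
    """
    return annual_cost * 10_000 / notional

notional = 10_000_000  # $10 million of bonds, as in the quotes above

print("Bear Stearns:", cds_spread_bps(45_500, notional), "bps")
print("Merrill:     ", cds_spread_bps(33_000, notional), "bps")
```

On that reading, the $5,800 rise in the Bear Stearns contract is a widening of 5.8 basis points in a single day.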

Long-Term Capital

Shares of Bear Stearns fell for a fourth day, declining 19 cents to $143.01 at 9:32 a.m. in New York Stock Exchange composite trading. The stock was down 12 percent this year before today, compared with the 0.4 percent advance of the Standard & Poor’s 500 Financials Index. Merrill dropped 20 cents to $87.48 and Citigroup fell 13 cents to $53.31.

The reaction to the Bear Stearns situation is reminiscent of Long-Term Capital Management LP, which lost $4.6 billion in 1998.

Lenders including Merrill and Bear Stearns met and agreed to take a stake in the Greenwich, Connecticut-based fund and slowly sold the assets to limit the impact of its collapse.

“We’re not surprised to find the principal circle of players is pretty interconnected,” said Roy Smith, professor of finance at New York University Stern School of Business and former head of Goldman’s London office. “What we’re looking for is whether the interconnection creates a negative domino effect: Whether Hedge Fund A creates a problem for other hedge funds, which in turn creates a problem for the prime brokers that are lending to them.”

Written by infoproc

June 21, 2007 at 3:26 pm