Posts

Showing posts from April 4, 2016

When it comes to wrangling data at scale, R, Python, Scala, and Java have you covered -- mostly

You have a big data project. You understand the problem domain, you know what infrastructure to use, and maybe you've even decided on the framework you will use to process all that data, but one decision looms large: What language should I choose? (Or perhaps more pointedly: What language should I force all my developers and data scientists to suffer?) It's a question that can be put off for only so long.

Sure, there's nothing stopping you from doing big data work with, say, XSLT transformations (a good April Fools' suggestion for tomorrow, simply to see the looks on everybody's faces). But in general, there are three languages of choice for big data these days -- R, Python, and Sca