Data science is the new "Hot Major"

That’s what a lot of Data Science departments appear to be teaching under that label: those, plus a couple of DS-specific courses.

A lot of interesting opinions on what to look for, and what to avoid, in these programs (alas, mostly blogs and forums, so I can’t link). Some say just do a statistics undergrad and go, or point out that data scientists come from stats, CS, math, even engineering. There was a caution about how much things have changed in the past 20 years, so just knowing R won’t cut it anymore.

…and of course the required soft skills - managing a team, applying to the business you’re working in, etc.

Here is an article about this topic: http://www.zdnet.com/article/infographic-most-companies-are-collecting-data-but-arent-using-big-data-solutions/

If colleges have trouble finding teachers for CS, how will they find teachers for data science?

Most of the data scientists I know have PhDs in statistics. I think stats would be a safer option.

YES. God save us from the data scientists who just want to “mess around in the data” to “see what we can find” rather than having a thought-out research plan with research questions to answer and constructs to explore.

(Actual phrases I have heard from actual data scientists that have actually made me want to put my head through my desk.)

An interesting read, for those who want to think more, is Weapons of Math Destruction, by Cathy O’Neil. She writes about how unintelligent analysis of big data can actually do a whole lot more harm than good.

^^^ My misconception about data science before I started learning about it was that you have a bunch of data, you unleash a bunch of machine learning algorithms on the data, out pops a bunch of valuable new information that nobody else knows about, and you use that new information to make a bunch of money.

The reality is you have to have a specific question you want answered before you do anything, and you spend 80% of your time trying to manipulate the data to get it into a form that’s usable. Then you run the algorithms on the data, and the rest of the time is spent trying to figure out if the results are valid. It’s very easy to screw things up and get totally incorrect and misleading results.
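To make the "easy to screw things up" point concrete, here is a minimal sketch in Python. Everything in it is made up for illustration: the `raw_ages` list, the -999 "missing" sentinel, and the 0 to 120 plausibility range are all assumptions, not anything from a real dataset. It just shows how skipping the cleanup step can quietly give you a nonsense answer.

```python
from statistics import mean

# Hypothetical raw export: ages as strings, with blanks, a non-numeric
# entry, and a -999 sentinel that some systems use to mean "missing".
raw_ages = ["34", " 41", "", "29", "-999", "n/a", "38", "-999", "52"]


def parse_age(text):
    """Return the age as an int, or None if the entry is unusable."""
    text = text.strip()
    try:
        value = int(text)
    except ValueError:
        return None  # blank or non-numeric entry
    if not 0 <= value <= 120:
        return None  # sentinel or implausible value (assumed range)
    return value


def naive_mean(values):
    """Average everything that parses as a number, sentinels included."""
    numbers = []
    for text in values:
        try:
            numbers.append(int(text.strip()))
        except ValueError:
            continue
    return mean(numbers)


def cleaned_mean(values):
    """Average only the entries that pass the plausibility check."""
    ages = [age for age in (parse_age(v) for v in values) if age is not None]
    return mean(ages)


print(naive_mean(raw_ages))    # dragged far below zero by the sentinels
print(cleaned_mean(raw_ages))  # average of the five plausible ages
```

The naive version runs without any error or warning, which is exactly the problem: nothing tells you the answer is garbage unless you go looking.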

Yes, that’s why during the interview I would give them a business problem related to data to see how they would solve it.

We encourage our competitors to use exactly this approach.

What is the future of AI and machine learning? I am afraid it could be a world of machines and automation without many jobs remaining for data scientists. Many data scientists are needed initially to create the various systems, but once those data systems have been set up in a highly automated fashion, the same number of data scientists is no longer needed to maintain them. This kind of phenomenon was seen in the field of MIS during 2000-2010; MIS was hot in the 90’s. I remember the MIS department at University of Central Florida was terminated a few years back and many of their professors tried to re-tool to be university administrators (I heard most of them failed to do so). That was a brutal scene.

McKinsey puts data processing as the second-easiest category of tasks (the easiest one being predictable physical tasks) to be replaced by automation: http://www.mckinsey.com/business-functions/digital-mckinsey/our-insights/where-machines-could-replace-humans-and-where-they-cant-yet

Perhaps the hard part about data science is actually explaining the conclusions and fallacies that may be pulled out of the data.

For example, here on these forums, it is common for posters to assume that a college’s admission rate is the main indicator of its admission selectivity, but that assumption fails to account for the different strength or weakness of the applicant pools to different colleges.
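A toy illustration of the admission rate point, in Python. The two colleges, the scores, and the seat counts are entirely invented, and the "admit purely by score" rule is a simplifying assumption that no real college follows. The only point is that a lower admission rate does not have to mean a higher bar.

```python
# Hypothetical applicant pools: each number is an applicant's test score.
# College A attracts many long-shot applicants; College B's pool is
# self-selected and strong. All figures are invented for illustration.
college_a_applicants = [900, 950, 1000, 1050, 1100, 1150, 1200, 1250, 1300, 1350]
college_b_applicants = [1300, 1350, 1400, 1450, 1500]


def admit_top(applicants, seats):
    """Admit the highest-scoring applicants until the seats are filled."""
    return sorted(applicants, reverse=True)[:seats]


def summarize(applicants, seats):
    """Return (admission rate, lowest admitted score)."""
    admitted = admit_top(applicants, seats)
    return len(admitted) / len(applicants), min(admitted)


rate_a, floor_a = summarize(college_a_applicants, seats=2)
rate_b, floor_b = summarize(college_b_applicants, seats=2)

print(f"College A: admits {rate_a:.0%}, lowest admitted score {floor_a}")
print(f"College B: admits {rate_b:.0%}, lowest admitted score {floor_b}")
```

With these made-up numbers, College A admits 20% and College B admits 40%, yet the lowest admitted score is 1300 at A and 1450 at B. The "less selective" college by admission rate is the harder one to get into.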

Another common fallacy is attributing the over-representation of Asian students in colleges to inherent racial qualities, even though the likely reason is that recent Asian immigration has selected for skilled workers and PhD students (i.e. people who already have high educational attainment), and high educational attainment tends to transmit strongly across generations, whether you see it as nature or nurture or both.

Haven’t read the whole thread, but I always like threads about data science. :)

My eldest graduated from MIT in 2016 with a math degree. He was trying for the math/CS degree, but didn’t quite get there. However, he taught himself various data science languages, and indeed works as a data scientist. He’s always loved, loved, loved statistics and data. I pegged him to be a sports statistician (and who knows, he might become one some day). I think the niche languages (and it’s probably been mentioned upthread) are R, Ruby, Python, and one other that I’m forgetting. He is considering a PhD in Machine Learning or Stats, but is doing very well with just his BS for the moment.

I can’t imagine someone like him being replaced by a machine, but who knows what the future will bring.

SAS, maybe.

I don’t think that’s going to happen anytime soon. Data science is not systems administration.

Back in the late 80s-early 90s, there was a lot of talk about how AI was going to wipe out software jobs, but if anything, AI is creating more software jobs.

My oldest son is about to graduate from a state school with a degree in Business Administration with a concentration in data analytics. He has had no problem getting internships and job offers. UPS offered him a job.

He is not a data scientist, but says you really need coding skills (Python and SQL) and need to be able to work with Tableau and Excel, even in the marketing application jobs. There is a mix of applications for data analyst jobs: some are more marketing related (creating data-based marketing plans), some are more data-coding related. He seemed to indicate there are different directions you can take this field in.

I don’t understand any of it, but everyone in his major found multiple decent-paying jobs.

SAS is an analytics software, not a programming language.

Well so is R. Or they both can be programming languages or analytics software, one is open source and one is proprietary.

SAS is a language that must be programmed. SAS is trying to stay current/relevant by adding analytics packages, but R and Python rule Data Science. Hadoop/Spark is also important in the field.

http://data.berkeley.edu/education/faqs

Here are the likely requirements for the major and minor.

Scala?

Here is the latest news on the new Data Science major.

https://data.berkeley.edu/education/faqs

This is the flavor of the month. I work in a somewhat related area, and my specialty experienced a similar curve. Some publication announces that it’s the best/hottest/etc. job, half of the schools in the country rush in to create a major or a master’s program, and the market becomes oversaturated with people whose diploma says X. And by that time there are enough software packages and/or customized setups within companies that require minimal maintenance and just a handful of people to keep them going. The key to succeeding in an area like this is to be there first, before it officially becomes hot, and to have a solid foundation in math so you can learn it on your own. So, no to a degree in data science; yes to math/applied math/statistics/economics with a good grasp of computer programming.