>> Data Exposed.
>> SQL Unplugged.
>> We haven’t done This Week on Channel 9. [LAUGH]
>> We did one.
>> We did, we’ve done a couple.
>> So it’s the last day of the fiscal year here.
>> That’s right.
>> At the beginning of the year we have commitments. And one of our commitments is we’re gonna do x number of shows. And I don’t even remember what I said, but given that it’s the last day of the year.
>> The fiscal year.
>> I thought, maybe we should get one more show in, and so I walked down the hall to my good buddy Scott-
>> Said, what are we gonna talk about [LAUGH].
>> And I said, join me on the show, and you’re here.
>> That’s right. Happy to be here.
>> Would that it were always that easy.
>> That easy.
>> So, we’re gonna-
>> I’m here for you Robert.
>> Talk about some data.
>> You have an entire show, the Data Exposed show, which I highly recommend to people. But let’s kinda step back and say you’re a developer, a Visual Studio developer, a .NET developer. You’ve been working on existing projects for a while. You get a break, you’ve now come up for air, and you wonder what’s new in the world of data in the past, I don’t know, 6 months, a year, whatever? I hear a lot of terms floating around: Hadoop and HDInsight and data lakes and data factories. And data catalogs and-
>> And blah blah, so there’s data catalog and SQL on Linux and there’s data lakes and, what’s up with all this? [LAUGH] So what’s up with all this, that’s what your show is, but give us an overview of what people should learn, where they can go to learn more. What’s been going on?
>> So that’s an interesting question, cuz as we take a step back and go, what’s new in the world of data? There’s been, and really, [LAUGH] if you’ve had your head in the sand for 12 months, I think we need to talk.
>> Well, not so much head in the sand, but you hear about things that are mentioned in keynotes, but what are these things?
>> What are these things, yeah.
>> The quick pitch, where do I go to learn more, when would I use this thing?
>> Yeah, those are good questions and I think what we have to do is we have to take a step back. And go, okay, how do we look at the entire data platform, entire data ecosystem. Because we just released SQL Server 2016. We had Rohan on the Data Exposed show and on SQL Unplugged show, talk about this. And there’s the SQL on Linux which is significant. So these are on-premises products. But then if you look at the whole features of SQL Server 2016, there’s some hybrid things. But then you go out, and you mentioned the HDInsight, and data lakes and all these things. And that’s cloud based and so you really have to take a step back and go, where do I learn about all this stuff, what is this stuff? What’s new in these, and then how do I use these in my environment, right? And I think it really depends on, I don’t wanna call it what your path is, but what are you interested in? What can improve your application, what can improve your industry and your product and things in your apps really.
So you take a step back and go, okay, where do I go search for this stuff? And really the best place to start, if you’re trying to figure this stuff out, is to hop over to two sites, really. The first place you go is MSDN, right? And then there’s microsoft.com/sqlserver. And really, here’s everything, some great product information around SQL Server 2016.
>> Okay, so give us the high-level overview of what are the new things? Maybe people are on 2012, they’re on 2014, and 2016 is out. So for me as a developer, not a database administrator, but me as a developer. What are some of the things I’m gonna care most about?
>> So for developers, Rohan actually made this comment, and I don’t know if you know who Rohan Kumar is. He basically owns SQL, the relational product at Microsoft. And he’s made the statement that SQL Server 2016 is probably the most significant release of SQL Server in a very long time, probably since 2005, right? So if you look at that and say, okay, what’s interesting for developers? There is a whole list of new features; SQL 16 is just loaded with new features. So things like PolyBase, which is interesting, and I’ll talk about these in a little detail in a bit. But you’ve got PolyBase, you’ve got Always Encrypted, you’ve got Stretch Database, you’ve got R integration, you’ve got In-Memory OLTP, you’ve got a whole bunch.
>> Now PolyBase, I think each one of these has a developer flavor to it. So PolyBase, we talked Hadoop and HDInsight, right? Take a step back, and one of the things that leads into this is, from a data perspective, we have to understand this. Even as developers, not just DBAs but developers, we have to understand that data’s just not relational anymore. This is why you have Stream Analytics, and Hadoop, and HDInsight, and all these technologies that allow us to do work in that realm: DocumentDB and all these third-party ones, Couch, Mongo, and things like that. It is just not relational anymore. We’re living in the world of big data and analytics.
>> Why is it not relational?
>> So things like weblogs, right, that’s not relational. CSV files, JSON, XML, whatever. Devices, right, the data that’s generated by your band, or your Fitbit or whatever. That’s not technically relational data.
>> And so we have to start asking, how do we work with this data? What Hadoop and HDInsight give you is mass processing for just terabytes and terabytes, exabytes of data. So we look at that and say, all right, how do we process this data? And so what PolyBase does in SQL 16 is it says, let me use a language that I already know, which is T-SQL, and connect to HDInsight and Hadoop as a service.
>> Is that the same thing, HDInsight and Hadoop?
>> Yeah, so Hortonworks has a Hadoop distribution. Cloudera’s got another big one. HDInsight is Hadoop as a service.
>> So we’re just running the Hortonworks version of Hadoop as a service in Azure.
>> What it allows me to do is-
>> So software for storing and analyzing massive amounts of structured and unstructured data.
>> Yep, tera-
>> So the band is just sending data. Your heart rate at this point in time was this.
>> Yep, and so if you think about it-
>> You did this many steps in this timeframe.
>> Steps, heart rates, skin temperature, whatever. Then, when you and I did the last This Week on Channel 9, we talked about the Raspberry Pi. So think about all the data that the Raspberry Pi or Tessels or all these other devices are generating. So we store all this structured and unstructured data; how do we process it? And we wanna be able to use technologies and languages that we already know. So through SQL Server 2016 using PolyBase, I can actually connect and write T-SQL statements to query unstructured data. That’s kind of cool. So as a developer, or a front end developer, I can do the same thing with languages I already know. That’s an interesting one. I like Always Encrypted from an application developer standpoint. I’m already in Visual Studio, C#, whatever, and security is always a big issue. And so previously we’ve had to say things like, okay, you gotta go create a key here and a key here.
And then pass these keys around, a public/private key type of thing. And it’s been kind of a headache. And with 16 you’ve got this feature called Always Encrypted, which just says, hey, your data is encrypted across the pipe the entire way, and there’s nothing really special you have to do. So when we were going out and talking to a lot of early adopters and ISVs, they were like, that’s one of the features we want right there.
>> So that’s a cool one, what were some of the others I talked about? R’s a good one, right, if you’re into analytics and you’re a developer. We’ve got R integration right in SQL Server, so you could write R and talk to SQL Server.
>> And do R, run analytics right in SQL Server.
>> So, you could also use the R Tools for Visual Studio. That’s an extension that’s available, right?
>> Yep, yep, so all this great technology, I think that’s a place to start. We’re doing a huge push on, how do we start leveraging some of these technologies with some of our ISVs and partners and customers and things like that?
>> So, there’s plenty. I think the best place to go is the download; there’s a worksheet or a white paper in there, or I forgot what it’s called, a factsheet, that says, here’s all the great features and-
>> Still we’re on, still SQL Server?
>> Yeah, still SQL Server. Yeah, go down there: features, pricing. There’s a thing over there that says, hey, how to download. The new capabilities: Always Encrypted. Real-Time Operational Analytics, that’s a good one. This allows me to do operational analytics right inside my SQL Server database. You saw there’s temporal tables and things like that. So operational analytics: typically in a company you’ll take your OLTP data, your transactional data, and then you’ll move that off to a reporting server and things like that to do a lot of your BI work. And operational analytics basically says, hey, let me do a lot of the analytical stuff right within my database, right? I don’t have to do ETL processes from point A to point B.
>> Stretch database [CROSSTALK]
>> That’s another one, right.
>> Automatic resizing of databases. [CROSSTALK]
>> Yes, this is an interesting one. Thanks for bringing that one up, because there’s just a whole list of these. Stretch Database is an interesting one for developers too, because this is where, from a DBA perspective, they’re like, okay, my database is just growing astronomically. I’ve got all these history tables, right? And the front end developer doesn’t care about the database.
>> But what he cares about is, how’s the application running? Do I have to make changes to my code if something happens on the DBA side? What kind of changes do I have to make, and things like that, right? But Stretch Database, which is really cool, is where we get into this hybrid scenario, like PolyBase. Hybrid from on-premises to the Cloud. So Stretch Database says, you know what? We’re actually gonna leverage the Cloud and make this significant change, and your application developer doesn’t have to make any changes. So what it does is it says, we’re gonna take a table.
>> We’re actually gonna stretch that table, hence the name, Stretch Database. We’re gonna stretch that table into the Cloud.
>> SQL Server still sees it as one table, right? And so we’ve got hot and cold data, right? Hot data is any data that you are continuously querying. So say I’ve got, you know, an orders table.
>> And I’ve got orders for the last ten years. But, because of compliance reasons or whatever, I’m only working with the last two or three years, right? What am I gonna do with the other seven years? Well, we can put it in a history table, but it’s still taking up costly disk space. What this says is, let me just take that orders table and stretch it to the Cloud. So what it does is it spins up an Azure SQL Database server and a database. And then it copies the data when you say, all right, I want years four through ten to go up into the Cloud. Years one through three still stay on-premises. SQL Server still sees that as one table. So as an application developer, I don’t have to change my, say, Select Star From. You should never go Select Star From, but say I query data that happens to be from five years ago.
>> SQL Server will take that query and go, this data is in the Cloud, go query that data, bring that data back and return it, right? So that’s significant for a developer, cuz I don’t have to change my code. You might want to implement things like, I just drew a blank on the name, where you’re dealing with latency and connections, and I think it’s called the Transient Fault Handling Application Block or something like that. Where you’re building in retries.
>> Just in case there’s things like that. So best practices say you may wanna do that anyway, right. But from a plain application standpoint, you don’t have to make any changes, right.
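The retry advice above can be sketched in a few lines. This is a minimal Python illustration of retrying a query with exponential backoff when a connection hiccups. It is not the fault handling framework mentioned in the conversation, and `TransientError`, `with_retries`, and `make_flaky_query` are hypothetical names made up for this sketch; the fake query simply fails a fixed number of times before returning rows.

```python
import random
import time


class TransientError(Exception):
    """Hypothetical stand-in for a dropped connection or a timeout."""


def with_retries(operation, max_attempts=4, base_delay=0.1, sleep=time.sleep):
    """Call operation(); on TransientError, wait and retry with exponential backoff."""
    for attempt in range(1, max_attempts + 1):
        try:
            return operation()
        except TransientError:
            if attempt == max_attempts:
                raise  # out of attempts: surface the failure to the caller
            # The delay doubles on each attempt; a little random jitter keeps
            # many clients from retrying at exactly the same moment.
            delay = base_delay * 2 ** (attempt - 1)
            sleep(delay + random.uniform(0, delay / 2))


def make_flaky_query(failures_before_success):
    """Build a fake query that fails a fixed number of times, then returns rows."""
    state = {"calls": 0}

    def query():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise TransientError("connection dropped")
        return [("order-1", 2011), ("order-2", 2012)]

    return query


if __name__ == "__main__":
    # The sleep function is swapped out so the demo does not actually wait.
    rows = with_retries(make_flaky_query(2), sleep=lambda _seconds: None)
    print(rows)
```

The application code that issues the query stays the same; only the call site is wrapped, which matches the point that stretching a table does not force query changes but a retry policy is still worth having.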
>> Mm-hm, all right, so that’s [CROSSTALK]
>> SQL Server [CROSSTALK]
>> That’s 2016 [CROSSTALK]
>> Lots of new there. [CROSSTALK]
>> And under that. [CROSSTALK]
>> Yep, PolyBase is the one we’re talking about, querying relational and non-relational data structures [CROSSTALK].
>> You’ve talked about Cloud, there are a number of data options in the Cloud.
>> Yup, so-
>> So give me an overview of those, starting with SQL Database. And then, talk about some of the other ones.
>> Yeah, we could probably do a whole myriad of shows just on the entire, right? But this is-
>> We should have a show on Channel 9 devoted to that type of stuff.
>> We should.
>> We should. You should lead it.
>> I should. I should call it Data Exposed. [LAUGH] But if you go there I think the best is Azure.com.
>> Takes you to this page, right. What I like here is, if you actually go and click on Documentation, cuz Products is just kind of an overview. If you wanna know how to use these features, you go to Documentation, and it sorta groups them in the same way. But like Data and Storage: SQL Database, DocumentDB-
>> The SQL Database this is literally-
>> Is SQL Server-
>> As a service, right.
>> Which version, is it 2016?
>> So running in the Cloud.
>> Running in the Cloud, right.
>> Most of everything that On-premises can do.
>> That’s pretty on par, right. I think there’s a few things-
>> There’s a handful of things that aren’t [INAUDIBLE]
>> Small handful of things like service broker and things like that, right.
>> DocumentDB is kind of a NoSQL document store in the Cloud. It allows me to store JSON documents and index those, like Couch or Mongo, things like that, right?
>> But this is primarily on the Cloud.
>> Is there an on prem version of that?
>> No, they get asked that all the time [LAUGH] they’re like, no.
>> No, I’m not quite sure if that’s even on the roadmap, right.
>> Then you have Storage: Blobs, Tables, Queues, and that’s been around forever. That’s our kind of NoSQL, not a document store. Blobs actually allow you to store actual documents, right.
>> Then you have SQL Data Warehouse, and we talked about Stretch Database.
>> So, it’s-
>> So SQL Data Warehouse is SQL Server data warehousing in the Cloud, right. And so for your BI component, you’ll be able to do data warehousing in the Cloud. They just took the on-premises version of the data warehouse and made it available in the Cloud. And then SQL Server Stretch Database, that’s what we talked about, this is the-
>> So what’s StorSimple? Is it data or is it storage?
>> It’s hybrid cloud storage. If you ask me, I don’t wanna lie to you, I’m not up on StorSimple, so we should probably move on from there.
>> I’ve got primary storage backups, archives, okay, so it’s not necessarily about data. Per se. Not the way we think about it.
>> Yeah, it’s not, hey let me store relational data, things like that.
>> Okay, cool.
>> Then you get into Analytics, right. If you go to Intelligence and Analytics, right, cuz this is where a lot of the things we’ve talked about [CROSSTALK]
>> Here’s Data Lake, Data Factory, Data Catalog and [CROSSTALK]
>> Machine Learning, Stream Analytics, all that good stuff, right.
>> So, give me an overview of all of these.
>> So, the Reader’s Digest version of this is, think about what we talked about, Tessels and Pis, the whole IoT space, the Internet of Things [CROSSTALK]
>> What are you talking about?
>> Tessels and Pis?
>> Tessels and Pis [LAUGH]. In our last [CROSSTALK].
>> For the folks who missed that or don’t remember.
>> So think about it, you’ve got these little Raspberry Pis, Tessels.
>> The device.
>> The devices, like your band, these little devices that generate data. It could be your phone, whatever.
>> Massive amounts of data.
>> Massive, massive amounts of data. How do we consume that data? How do we store that data? How do we process that data? When we get into the Internet of Things, it’s how do we process that data. They call them the three V’s: volume, velocity and variety, you know? The volume is just the mass amounts of data, like you saw, right? The variety is just the different types: relational, non-relational, structured, semi-structured. So you have volume and variety, and then velocity, right? Velocity is, how fast are we consuming that data, do we need to stream that data in? But for me it also means, how fast do we need to understand that data?
>> Right, so we have things like these devices. Do we have it? Is it here? There’s Event Hubs. If you go to Intelligence, it might be under there. So there’s Event Hubs and there’s IoT Hub. It should be under Internet of Things. There you go. IoT Hub, right.
>> Event Hubs, a lot of these, because Stream Analytics and Machine Learning are also on the Analytics tab. But yeah, Event Hub and IoT Hub, this is where you take that data, you consume it, you send it in to Event Hub or IoT Hub, and it’s kind of a temporary storage place where it gets ready to be picked up. So you’ll have Stream Analytics, a streaming service that’ll go into Event Hub or IoT Hub, pick that data up, and then route it to a hot path or a cold path. On the hot path I’ll actually say, hey, I need to see this right now. So let me send the data to Power BI, a Power BI dashboard, right?
>> Or maybe I wanna process it later, so let me store it. If you go back to Analytics, we send it to either Storage or a Data Lake Store, right? For mass storage, right?
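The hot path and cold path idea can be illustrated with a small Python sketch. This is only a toy model of the routing decision, not Stream Analytics or any Azure API; the `Reading` type, the `route` function, and the heart-rate threshold are assumptions made up for illustration. One design choice here is also an assumption: every reading is kept on the cold path for later processing, and only the urgent ones are copied to the hot path.

```python
from dataclasses import dataclass


@dataclass
class Reading:
    """Hypothetical device event, e.g. one heart-rate sample from a band."""
    device_id: str
    heart_rate: int


def route(readings, alert_threshold=150):
    """Return (hot, cold): urgent readings for a live dashboard, everything for storage."""
    hot, cold = [], []
    for reading in readings:
        cold.append(reading)  # cold path: keep everything for later batch analysis
        if reading.heart_rate >= alert_threshold:
            hot.append(reading)  # hot path: someone needs to see this right now
    return hot, cold


if __name__ == "__main__":
    events = [Reading("band-1", 72), Reading("band-2", 171), Reading("band-1", 150)]
    hot, cold = route(events)
    print("hot path:", [r.device_id for r in hot])
    print("cold path size:", len(cold))
```

In a real pipeline the two lists would be two outputs, such as a dashboard and a storage account, rather than in-memory lists.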
>> So what’s the Data Lake Store?
>> So if you go back to Data, remember there’s Data and Storage. Data Lake Store is a hyper-scale, no-limitation data storage mechanism, right.
>> It allows me to store just any type of data, right.
>> Got it.
>> So no file size limitation. No limitation to how much data you can have. And on top of that, because it’s in Azure, I can run HDInsight on top of that data. Things like that, right.
>> So a place to store this massive amount of data that these things are sending and data analytic workloads. It’s not necessarily stuff that comes from the band. It could be stuff that comes from your-
>> E-commerce side.
>> E-commerce side right.
>> Any type of data that’s being ingested. Where do you want to ingest that and where do you wanna store that, right.
>> Mm-hm. And then, because of Azure Data Lake, you have Azure Data Lake Analytics, which has the new U-SQL language. And this is interesting for developers, so if you are a Visual Studio developer, go look at it. Go back, go to Documentation here. Is it listed here? Is it under Analytics? We went into Analytics, I wonder if it’s listed there? Go try Data Lake Analytics, because there’s this new language called U-SQL, which stands for Unified SQL, right. You write it in Visual Studio. In fact, that ought to be Robert’s and my next show together: U-SQL.
>> So, you’re saying if I search for Visual Studio and U-SQL, I will find something.
>> No not MySQL. This is live folks. No.
>> No. Let’s do this,
>> Did I get U-SQL wrong?
>> Actually let’s just do U-SQL.
>> U-SQL. [CROSSTALK]
>> Right there you go, the new data language for, there you go. Right there.
>> Okay? That’s it right.
>> The new big data language for Azure Data Lake. Allows me basically so in Visual Studio I can write all these massive queries.
>> SQL plus C#.
>> SQL plus C#.
>> Right so maybe we should do a show on that.
>> We should.
>> We should do it, because it’s right in Visual Studio, man.
>> So this is if you want to get into analytics, right? U-SQL is your tool of choice. It allows me to write T-SQL and C# and combine those. That’s what we call Unified SQL. It allows me to do very hyper-scale analytical queries and jobs for processing all of that data, right.
>> So you have all this data sitting in Azure Data Lake Store, and you want to process that data and get insight. It allows me to use U-SQL to go execute jobs and do analytics across all that data.
>> So that’s one if you’re a Visual Studio developer. Go look at U-SQL right.
>> Okay. That’s another one. Then Machine Learning. Data Factory is another one. This is from a data movement perspective, right. It allows me to take data and do transformations and kick off jobs and things like that. It’s kind of a data movement, ETL type of tool, right, for-
>> Isn’t that something that a developer would be-
>> Yep, because Stream Analytics and Data Factory have Visual Studio plug-ins, right?
>> It’s a good question. They both have Visual Studio plug-ins. You can actually write Stream Analytics jobs or create Data Factory pipelines in Visual Studio and then publish those to the Cloud. So you can do authoring, and debugging, and things like that right in Visual Studio. That’s an awesome one.
>> Machine Learning, I don’t think there’s a Visual Studio plug-in. In fact I doubt there’s one, but I could be wrong. But Machine Learning, this is your predictive analytics, right? I’m gonna take all this data and run it through and kinda let it learn, look for patterns in my data, predictive analytics type of stuff. But your big tools are Data Lake Analytics with U-SQL, Stream Analytics, and Data Factory. And actually, if you go back to Data and Storage, with HDInsight, [INAUDIBLE] somewhere in there. Somewhere in there’s HDInsight. And there’s the other one, the HDInsight plug-in.
>> HDInsight is under Analytics.
>> Yeah, Analytics. There’s an HDInsight plug-in for Visual Studio.
>> I can actually write MapReduce jobs, right. And fire those off, right.
>> So what’s the data catalog?
>> So Data Catalog, I just learned about this. It’s all about, if I’m in an organization, how do I discover data? You join Microsoft, right? [LAUGH] How do I know where data is? If I wanna go find, where’s this type of data? Where’s that type of data? Right? Azure Data Catalog is data discovery for just about any type of data source you have.
>> Discover, understand, and consume data sources. Is that kinda cool?
>> Yeah, it’s actually pretty, yeah. And it’s at the point where, hey, you now have security and privileges and rights. You can get granular about who has access to what. But it makes it easy for people to come in and say, okay, where is this data that I’m looking for? So it makes data discovery very easy. And then the show we just did is kind of the glossary on top of that. So I can kind of, because-
>> You’re just showing that?
>> One, yep. The last show we just did, right? Business glossary. That says, you may have a term for one thing, and I may have a different term for the same thing. How do we kind of come to agree on a term? And then we can actually add tags to these terms, so we can make them easily searchable and things like that, right?
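The glossary idea, several names for the same thing resolving to one agreed term that can be tagged and searched, can be sketched as a small data structure. This Python sketch is purely illustrative and is not how Azure Data Catalog is implemented or called; the `Glossary` class and its method names are hypothetical.

```python
class Glossary:
    """Toy business glossary: synonyms resolve to one agreed term, and terms carry tags."""

    def __init__(self):
        self._canonical = {}  # lowercase name or synonym -> agreed term
        self._tags = {}       # agreed term -> set of tags

    def agree(self, term, synonyms=()):
        """Register the agreed term plus any other names people use for it."""
        for name in (term, *synonyms):
            self._canonical[name.lower()] = term
        self._tags.setdefault(term, set())

    def resolve(self, name):
        """Map any known name to the agreed term (raises KeyError if unknown)."""
        return self._canonical[name.lower()]

    def tag(self, name, *tags):
        """Attach tags to the agreed term behind this name."""
        self._tags[self.resolve(name)].update(tags)

    def search(self, tag):
        """Return the agreed terms that carry the given tag, sorted by name."""
        return sorted(term for term, tags in self._tags.items() if tag in tags)


if __name__ == "__main__":
    glossary = Glossary()
    glossary.agree("Customer", synonyms=["Client", "Account holder"])
    glossary.tag("client", "sales")
    print(glossary.resolve("CLIENT"))
    print(glossary.search("sales"))
```

Lookups are case-insensitive so that two teams typing the same word differently still land on the same agreed term.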
>> Right, cool.
>> So it’s kind of more of a data discovery, indexing type of thing. I don’t know if there’s anything in Visual Studio there. But a lot of this stuff is now being moved to Visual Studio from a developer perspective. Cuz if I want to author things for the Cloud, such as streaming or data movement, ETL processes, pipelines, or if I want to kick off MapReduce jobs, a lot of this stuff you can now do through Visual Studio.
>> Very powerful, very cool. If you’re new to this stuff and want to just, a lot of these, if you just go up to-
>> A great place to learn is the documentation.
>> Yeah, start with the documentation. Like for example, if you go to [CROSSTALK] Analytics and then say, all right, tell me about Stream Analytics, or actually Data Factory, right. I go to Data Factory and there’s a whole thing: learning path, overview, [CROSSTALK]. Build your first pipeline, develop a [CROSSTALK]. This is the actual place to start. It gives you a high level, but what I like about this is, every single one of these has a learning path, right. Start here, do this, and there are links to these things as it goes on. And it’s kind of a step by step, to this, to this. And then, hey, if you’re ready to start getting deeper, you dive in and start on the advanced things, like run samples, manage pipelines, debug pipelines, you know. A lot of these are, and there’s U-SQL right there. A lot of these have things like how to do it in Visual Studio, like, what was I just looking at? If you look at Data Factory, you go back, this is kinda fun.
Go back here and say overview, or build your first pipeline, right. Well, do I wanna do it using the portal, or using PowerShell, or using Visual Studio, right? A lot of these, if you’ve got Visual Studio, you can do it right through, right.
>> Visual Studio.
>> Nice. Download the latest Azure Data Factory plug-in from the Visual Studio Gallery. [CROSSTALK] A lot of these are awesome. All right.
>> Yeah, so this is actually pretty cool. It allows me to say, hey, I wanna work in the world of big data, but I don’t wanna have to open a browser. So the tools are getting, it’s version 0.9, right? So a lot of these tools are still in-
>> 63,000 downloads, that’s pretty good.
>> Still, yeah, significant downloads, right. So people are starting to ask, how do we play with this? And they’re taking feedback. So even if you just want to start playing with it, you know, this is a great place to learn and then provide feedback.
>> One of the cool things about Azure, that team has done an amazingly excellent job with their website in just showing you how to get started doing things. A cool number of tutorials.
>> And I just see that, and I was just talking to someone about this, if you go back. Need to go back a couple. You’re going back to that main page, where was it at. You know, here we go. It would be nice if we had something like that for SQL Server, right? Where we say, hey, learn how to do X, learn how to do that, right? Give me a playground to play with that. Azure is doing a phenomenal job, as you see, right? They do a phenomenal job, even from a tooling perspective. I’d like to see SQL kind of go that way. And not that they’re doing a bad job. But all the things we focus on from a SQL perspective are for, like, DBAs, right? And I think there’s a gap for developers. Front end developers, right? Give me a playground. How do I do X? How do I take advantage of some of these things if I’m a developer? There’s a push for that.
>> Come on.
>> So lots of fun.
>> We could do that on the show.
>> We can do that yeah.
>> We could build vaults on this show.
>> That’s right.
>> Just let me know.
>> Not doing every week.
>> And there is plenty to do from Visual Studio if you look at this. There’s plenty we can do from a big data perspective. How do you start, how do you fire up and do U-SQL, or do some of these things. So there is plenty that I can talk about.
>> Cool. Well, I look forward to featuring some of this stuff.
>> I’m looking forward to this too. This will be fun.
>> And so. Excellent
>> Back to the Robert discussion.
>> Starting tomorrow, which will be the next fiscal year.
>> [LAUGH] That just got a while. I’m planning.
>> All right. Thanks. Thanks for coming up
>> My pleasure. [INAUDIBLE]
>> [INAUDIBLE] a lot of this stuff that’s out there, as you see, is fairly easy to go learn about and start playing around with. Thanks so much.
>> My pleasure. Always a pleasure.
>> We’ll see you next time on Visual Studio Toolbox.
>> Cheers. [SOUND]
- SQL Server 2016 [03:30] -> Some of the exciting new features in SQL Server 2016 include Stretch Database (stretch your on-premises cold data transparently and securely to the cloud), PolyBase (access and combine both relational and non-relational data all from within SQL Server), Always Encrypted (enables clients to encrypt sensitive data inside client applications without ever revealing the encryption keys to the database engine), R Services (a platform for developing and deploying intelligent applications using the powerful R language), and operational analytics (run both analytics and OLTP workloads in the same database).
- Cloud [13:00] -> Focusing on the cloud, Scott discusses many of the Azure services that focus on data starting with SQL Database and Azure SQL Data Warehouse.
- Analytics [15:15] -> Azure Cortana Intelligence suite of services, including Event Hub and IoT Hub (data ingress services), Stream Analytics (real-time event processing engine), Azure Data Factory (data integration and orchestration service), Azure Data Lake Store (hyper-scale repository for big data analytic workloads).
- U-SQL [20:10] -> A language that unifies the benefits of SQL with the expressive power of .NET for Azure Data Lake Analytics.
- Visual Studio plug-ins [22:00] -> Visual Studio plug-ins for both Azure Data Factory and Azure Stream Analytics.