Using C# with U-SQL (SQLBits 2016)

Using C# with U-SQL (SQLBits 2016 ADL/USQL Pre-Conference)
U-SQL Extensibility, Inline C#, C# UDFs, U-SQL Assemblies

Published in: Data & Analytics

  1. Michael Rys, Principal Program Manager, Big Data @ Microsoft
     @MikeDoesBigData, {mrys, usql}@microsoft.com
     Using C# with U-SQL
     2016/04/04
  2. Extensible From Ground Up
     • Type system is based on C#
     • Expression language IS C#
     • User-defined functions (U-SQL and C#)
     • User-defined Aggregators (C#)
     • User-defined Operators (UDO) (C#)
     U-SQL provides the Parallelization and Scale-out Framework for Usercode
     • EXTRACTOR, OUTPUTTER, PROCESSOR, REDUCER, COMBINER, APPLIER

        REFERENCE ASSEMBLY MyDB.MyAssembly;

        CREATE TABLE T( cid int, first_order DateTime, last_order DateTime,
                        order_count int, order_amount float );

        @o = EXTRACT oid int, cid int, odate DateTime, amount float
             FROM "/input/orders.txt"
             USING Extractors.Csv();

        @c = EXTRACT cid int, name string, city string
             FROM "/input/customers.txt"
             USING Extractors.Csv();

        // MyAgg.MySum and MyNamespace.MyClass.MyFunction come from MyDB.MyAssembly.
        @j = SELECT c.cid, MIN(o.odate) AS firstorder,
                    MAX(o.odate) AS lastorder, COUNT(o.oid) AS ordercnt,
                    AGG<MyAgg.MySum>(o.amount) AS totalamount
             FROM @c AS c LEFT OUTER JOIN @o AS o ON c.cid == o.cid
             WHERE c.city.StartsWith("New")
                   && MyNamespace.MyClass.MyFunction(o.odate) > 10
             GROUP BY c.cid;

        OUTPUT @j
        TO "/output/result.txt"
        USING new MyData.Write();
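The point that the expression language is C# can be tried without any registered assembly. The following is a minimal sketch, not taken from the deck: the rowset values, the column names, and the output path are made up for illustration, and the built-in Outputters.Csv() stands in for the custom outputter used on the slide.

```usql
// Hypothetical in-script rowset, so the sketch needs no input files.
@customers =
    SELECT * FROM
        (VALUES
            (1, "Ann", "New York"),
            (2, "Bob", "Seattle"),
            (3, "Cy",  "Newark")
        ) AS T(cid, name, city);

// The SELECT expressions and the WHERE predicate are ordinary C#:
// string methods and properties, combined with && and comparison operators.
@result =
    SELECT cid,
           name.ToUpper() AS name_upper,
           city.Length AS city_length
    FROM @customers
    WHERE city.StartsWith("New") && name.Length > 2;

// Hypothetical output path.
OUTPUT @result
TO "/output/inline_csharp.csv"
USING Outputters.Csv();
```

Because the expressions are compiled as C#, comparisons use == and && rather than the SQL = and AND, as in the slide's JOIN and WHERE clauses.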
  3. Managing Assemblies
     Create assemblies
     • CREATE ASSEMBLY db.assembly FROM @path;
     • CREATE ASSEMBLY db.assembly FROM byte[];
     • Can also include additional resource files
     Reference assemblies
     • REFERENCE ASSEMBLY db.assembly;
     • Referencing .Net Framework Assemblies
       • Always accessible system namespaces:
         • U-SQL specific (e.g., for SQL.MAP)
         • All provided by System.dll, System.Core.dll, System.Data.dll, System.Runtime.Serialization.dll, mscorlib.dll (e.g., System.Text, System.Text.RegularExpressions, System.Linq)
       • Add all other .Net Framework Assemblies with: REFERENCE SYSTEM ASSEMBLY [System.XML];
     Enumerate assemblies
     • PowerShell command
     • U-SQL Studio Server Explorer
     Drop assemblies
     • DROP ASSEMBLY db.assembly;
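The create, reference, and drop statements listed on this slide can be strung together roughly as follows. This is a sketch under assumptions: the database name MyDB, the assembly name MyAssembly, and the DLL path are hypothetical, and the three parts are shown as separate scripts because registration normally happens once while referencing happens in every script that uses the code.

```usql
// Script 1: register the assembly (run once).
// MyDB, MyAssembly and the DLL path are hypothetical names.
CREATE DATABASE IF NOT EXISTS MyDB;
DROP ASSEMBLY IF EXISTS MyDB.MyAssembly;
CREATE ASSEMBLY MyDB.MyAssembly FROM @"/assemblies/MyAssembly.dll";

// Script 2: reference it in any script that calls its C# code.
REFERENCE ASSEMBLY MyDB.MyAssembly;
// .NET Framework assemblies outside the always-available set
// are pulled in with the SYSTEM keyword, as on the slide.
REFERENCE SYSTEM ASSEMBLY [System.XML];

// Script 3: remove the registration when it is no longer needed.
DROP ASSEMBLY IF EXISTS MyDB.MyAssembly;
```

Dropping before creating in Script 1 makes re-registration of an updated DLL repeatable; whether that is appropriate depends on who else references the assembly.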
  4. Assembly Dependencies
     • Assembly must be registered to be referenced
     • All Assemblies needed for compilation must be referenced in script
     • All Assemblies needed at runtime either
       • Need to be referenced in script, or
       • Need to be registered with the assembly as additional files
     • Metadata Service does NOT enforce dependencies
     • Visual Studio Extension provides support for dependency management
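The two ways of supplying a runtime dependency might look like the sketch below. The assembly names, the helper DLL, and the paths are hypothetical; the WITH ADDITIONAL_FILES clause is the CREATE ASSEMBLY option for shipping extra files alongside the registered DLL.

```usql
// Option A: ship the dependency with the main assembly as an additional file.
// Scripts then only need to reference MyDB.MyAssembly.
// All names and paths here are hypothetical.
CREATE ASSEMBLY IF NOT EXISTS MyDB.MyAssembly
FROM @"/assemblies/MyAssembly.dll"
WITH ADDITIONAL_FILES = (@"/assemblies/MyHelper.dll");

// Option B: register the dependency as its own assembly instead...
CREATE ASSEMBLY IF NOT EXISTS MyDB.MyHelper
FROM @"/assemblies/MyHelper.dll";

// ...and reference both in every script that needs them.
REFERENCE ASSEMBLY MyDB.MyAssembly;
REFERENCE ASSEMBLY MyDB.MyHelper;
```

Since the metadata service does not enforce dependencies, nothing stops a script from referencing only MyDB.MyAssembly under Option B; the missing helper would then surface only when the code that needs it is compiled or run.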
  5. Additional Resources
     • MSDN Article: https://msdn.microsoft.com/en-us/magazine/mt614251
     • Sample Data: https://github.com/Azure/usql/tree/master/Examples/Samples/Data/Tweets
     • Sample Project: https://github.com/Azure/usql/tree/master/Examples/TweetAnalysis
  6. http://aka.ms/AzureDataLake
