Tuesday, July 26, 2016

U-SQL

U-SQL is a data processing language that unifies the benefits of SQL with the expressive power of your own code to process all data at any scale. U-SQL’s scalable, distributed query capability enables you to efficiently analyze data in the store and across relational stores such as Azure SQL Database. It enables you to process unstructured data by applying schema on read, to insert custom logic and UDFs, and it includes extensibility for fine-grained control over how to execute at scale.
U-SQL is the new big data query language of the Azure Data Lake Analytics service. It evolved out of Microsoft's internal big data language called SCOPE and combines a familiar SQL-like declarative language with the extensibility and programmability provided by C# types and the C# expression language, as well as big data processing concepts such as “schema on read”, custom processors, and reducers. It also provides the ability to query and combine data from a variety of data sources, including Azure Data Lake Storage, Azure Blob Storage, Azure SQL DB, Azure SQL Data Warehouse, and SQL Server instances running in Azure VMs. It is, however, not ANSI SQL.
U-SQL script:
The main unit of a U-SQL “program” is a U-SQL script. A script consists of an optional script prolog and a sequence of U-SQL statements.
@t = EXTRACT date string
        , time string
        , author string
        , tweet string
    FROM "/input/MyTwitterHistory.csv"
    USING Extractors.Csv();

@res = SELECT author
    , COUNT(*) AS tweetcount
    FROM @t
    GROUP BY author;

OUTPUT @res TO "/output/MyTwitterAnalysis.csv"
ORDER BY tweetcount DESC
USING Outputters.Csv();
The above U-SQL script shows the three major steps of processing data with U-SQL:
  • Extract data from your source, using the EXTRACT statement in the query. The data types are based on C# data types, and the script uses the built-in Extractors library to read and schematize the CSV file.
  • Transform using SQL and/or custom user defined operators.
  • Output the result either into a file or into a U-SQL table to store it for further processing.
U-SQL combines familiar concepts from a variety of languages: it is a declarative language like SQL; it follows a dataflow-like composition of statements and expressions, like Cascading; it offers simple ways to extend the language with user-defined operators, user-defined aggregators, and user-defined functions written in C#; and it provides a SQL database-like metadata object model to manage, discover, and secure structured data and user code.
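As a small taste of that C# extensibility, below is a minimal sketch of a U-SQL code-behind file; the namespace, class, and method names are hypothetical, not part of any shipped library. A public static method defined this way can be called directly from a U-SQL expression.

// Hypothetical code-behind (e.g. MyScript.usql.cs) exposing a UDF to U-SQL.
using System;

namespace TwitterAnalysis
{
    public static class Udfs
    {
        // Counts the "@" mentions that appear in a tweet's text.
        public static int CountMentions(string tweet)
        {
            if (string.IsNullOrEmpty(tweet)) return 0;

            int count = 0;
            foreach (char c in tweet)
            {
                if (c == '@') count++;
            }
            return count;
        }
    }
}

In the script above, such a function could then be projected in the SELECT, for example: SELECT author, TwitterAnalysis.Udfs.CountMentions(tweet) AS mentions FROM @t;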

Sunday, July 10, 2016

SQL Server Management Objects (SMO)

SQL Server Management Objects (SMO) are .NET objects introduced by Microsoft with Microsoft SQL Server 2005, designed to allow easy and simple programmatic management of Microsoft SQL Server. You can use SMO to build customized SQL Server management applications. The SMO object model extends and supersedes the SQL Distributed Management Objects (SQL-DMO) object model. Compared to SQL-DMO, SMO increases performance, control, and ease of use. Most SQL-DMO functionality is included in SMO, and there are various new classes that support new features in SQL Server. The object model is intuitive and uses SQL-DMO terminology where possible, to help you transfer your skills.
New features in SMO for SQL Server 2016 include the following:
  • Cached object model and optimized object instance creation. Objects are loaded only when specifically referenced. Object properties are only partially loaded when the object is created. The remaining objects and properties are loaded when they are referenced directly.
  • Batched execution of Transact-SQL statements. Statements are batched to improve network performance.
  • Capture Transact-SQL statements. Allows any operation to be captured into a script (a sketch follows this list). Management Studio uses this capability to script an operation instead of executing it immediately.
  • Management of SQL Server services with the WMI Provider. SQL Server services can be started, stopped, and paused programmatically.
  • Advanced Scripting. Transact-SQL scripts can be generated to re-create SQL Server objects that describe relationships to other objects on the instance of SQL Server.
  • Use of Unique Resource Names (URNs). A URN allows you to create instances of and reference SMO objects.
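To illustrate the capture feature above, here is a minimal C# sketch; it assumes a local default instance with Windows authentication, and the database name is hypothetical.

// A minimal sketch: record the T-SQL an SMO operation would run
// instead of executing it against the server.
using System;
using Microsoft.SqlServer.Management.Common;
using Microsoft.SqlServer.Management.Smo;

class CaptureDemo
{
    static void Main()
    {
        var server = new Server("localhost");    // assumed local default instance

        // Switch the connection into capture mode: statements are
        // recorded rather than sent to the server.
        server.ConnectionContext.SqlExecutionModes = SqlExecutionModes.CaptureSql;

        var db = new Database(server, "DemoDb"); // hypothetical database name
        db.Create();                             // captured, not executed

        foreach (string statement in server.ConnectionContext.CapturedSql.Text)
            Console.WriteLine(statement);
    }
}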
SMO also represents, as new objects or properties, many features and components that were introduced in SQL Server 2005. These new features and components include the following:
  • Table and index partitioning for storage of data on a partition scheme.
  • HTTP endpoints for managing SOAP requests.
  • Snapshot isolation and row level versioning for increased concurrency.
  • XML schema collections, XML indexes, and the xml data type provide validation and storage of XML data. For more information, see XML Schema Collections (SQL Server) and Using XML Schemas.
  • Snapshot databases for creating read-only copies of databases.
  • Service Broker support for message-based communication.
  • Synonym support for multiple names of SQL Server database objects.
  • The management of Database Mail that lets you create e-mail servers, e-mail profiles, and e-mail accounts in SQL Server.
  • Registered Servers support for registering connection information.
  • Trace and replay of SQL Server events.
  • Support for certificates and keys for security control.
  • DDL triggers for adding functionality when DDL events occur.
The SMO namespace is Microsoft.SqlServer.Management.Smo. SMO is implemented as a Microsoft .NET Framework assembly, which means the common language runtime from the Microsoft .NET Framework version 2.0 must be installed before using the SMO objects. The SMO assemblies are installed by default into the Global Assembly Cache (GAC) with the SQL Server SDK option.
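To make the object model concrete, below is a minimal C# sketch; the server name, the AdventureWorks database, and the Person.Person table are assumptions, so substitute your own. It connects, enumerates databases, and scripts one table via its URN.

// A minimal SMO sketch: connect, list databases, script a table.
using System;
using Microsoft.SqlServer.Management.Sdk.Sfc;
using Microsoft.SqlServer.Management.Smo;

class SmoDemo
{
    static void Main()
    {
        var server = new Server("localhost");    // assumed local default instance

        // Objects are loaded lazily; iterating the collection fetches them.
        foreach (Database db in server.Databases)
            Console.WriteLine(db.Name);

        // Script a table (hypothetical sample database and table).
        Table table = server.Databases["AdventureWorks"].Tables["Person", "Person"];

        var scripter = new Scripter(server);
        scripter.Options.DriAll = true;          // include constraints and keys

        foreach (string line in scripter.Script(new Urn[] { table.Urn }))
            Console.WriteLine(line);
    }
}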

Wednesday, June 29, 2016

SQL SERVER 2016 - JSON Data

JSON is a popular textual data format used for exchanging data in modern web and mobile applications. JSON is also used for storing unstructured data in log files or NoSQL databases. Many REST web services return results formatted as JSON text or accept data formatted as JSON. JSON is also the main format for exchanging data between web pages and web servers using AJAX calls.

SQL Server provides built-in functions and operators for the following JSON data manipulations:
  • Parse JSON text and read or modify values.
  • Transform arrays of JSON objects into table format.
  • Use any Transact-SQL query on the converted JSON objects.
  • Format the results of Transact-SQL queries in JSON format.

Transform JSON text to a relational table:
OPENJSON is a table-valued function (TVF) that seeks into JSON text, locates an array of JSON objects, iterates through the elements of the array, and generates one row in the output result for each element. This feature is available as of CTP3. One use of the OPENJSON function in a T-SQL query is shown in the following example:
-- Hypothetical sample payload, declared here so the query runs as-is
DECLARE @SalesOrderDetails nvarchar(max) =
    N'{"OrdersArray":[{"Number":"SO43659","Customer":"MSFT","Quantity":1},
       {"Number":"SO43661","Customer":"Contoso","Quantity":3}]}';

SELECT Number, Customer, Quantity
FROM OPENJSON(@SalesOrderDetails, '$.OrdersArray')
WITH (
    Number   varchar(200),
    Customer varchar(200),
    Quantity int
) AS OrdersArray;

Exporting data as JSON:
The first feature available in SQL Server 2016 CTP2 is the ability to format query results as JSON text using the FOR JSON clause. If you are familiar with the FOR XML clause, you will easily understand FOR JSON. When you add the FOR JSON clause at the end of a T-SQL SELECT query, SQL Server takes the results, formats them as JSON text, and returns the text to the client. Every row is formatted as one JSON object, the values in its cells become the values of that JSON object, and column names or aliases are used as key names. Below is the syntax:
SELECT column, expression, column as alias
FROM table1, table2, table3
FOR JSON [AUTO | PATH]
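Note that SQL Server returns the JSON text to the client as an ordinary result set and may split a large document across several rows, so a client typically concatenates the chunks. A minimal C# sketch of that pattern follows; the connection string, database, and table are assumptions.

// A minimal sketch: read a FOR JSON result, which may arrive chunked.
using System;
using System.Data.SqlClient;
using System.Text;

class ForJsonDemo
{
    static void Main()
    {
        // Hypothetical connection string and sample table.
        const string connectionString =
            "Server=localhost;Database=AdventureWorks;Integrated Security=true;";

        using (var connection = new SqlConnection(connectionString))
        using (var command = new SqlCommand(
            "SELECT TOP 5 FirstName, LastName FROM Person.Person FOR JSON PATH",
            connection))
        {
            connection.Open();
            var json = new StringBuilder();

            using (var reader = command.ExecuteReader())
            {
                // Concatenate the rows into one JSON document.
                while (reader.Read())
                    json.Append(reader.GetString(0));
            }

            Console.WriteLine(json.ToString());
        }
    }
}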

Tuesday, June 28, 2016

AngularJS

AngularJS is an open source, JavaScript-based web application development framework. It can be added to an HTML page with a <script> tag. AngularJS extends HTML attributes with Directives and binds data to HTML with Expressions. The AngularJS framework works by first reading the HTML page, which has additional custom tag attributes embedded in it. Angular interprets those attributes as directives that bind input or output parts of the page to a model represented by standard JavaScript variables. The values of those JavaScript variables can be set manually within the code, or retrieved from static or dynamic JSON resources.

The definition of AngularJS, as given by its official documentation, is as follows:
"AngularJS is a structural framework for dynamic web applications. It lets you use HTML as your template language and lets you extend HTML's syntax to express your application components clearly and succinctly. Its data binding and dependency injection eliminate much of the code you currently have to write. And it all happens within the browser, making it an ideal partner with any server technology."

In effect, AngularJS gives your application an expanded HTML vocabulary.

Features of AngularJS are:
  • AngularJS is an efficient framework that can create Rich Internet Applications (RIA).
  • AngularJS gives developers the option to write client-side applications using JavaScript in a clean Model View Controller (MVC) way.
  • Applications written in AngularJS are cross-browser compliant. AngularJS automatically handles JavaScript code suitable for each browser.
  • AngularJS is open source, completely free, and used by thousands of developers around the world. It is licensed under the Apache license version 2.0.

Thursday, June 16, 2016

Common Language Infrastructure (CLI)

The Common Language Infrastructure (CLI) provides a specification for executable code and the execution environment (the Virtual Execution System) in which it runs. Executable code is presented to the VES as modules. At the center of the CLI is a unified type system, the Common Type System (CTS), that is shared by compilers, tools, and the CLI itself. It is the model that defines the rules the CLI follows when declaring, using, and managing types. The CTS establishes a framework that enables cross-language integration, type safety, and high-performance code execution. This section describes the architecture of the CLI by describing the CTS. The following four areas are covered:
  • The Common Type System (CTS) - The CTS provides a rich type system that supports the types and operations found in many programming languages. The CTS is intended to support the complete implementation of a wide range of programming languages.
  • Metadata - The CLI uses metadata to describe and reference the types defined by the CTS. Metadata is stored (that is, persisted) in a way that is independent of any particular programming language. Thus, metadata provides a common interchange mechanism for use between tools (such as compilers and debuggers) that manipulate programs, as well as between these tools and the VES.
  • The Common Language Specification (CLS) - The CLS is an agreement between language designers and framework (that is, class library) designers. It specifies a subset of the CTS and a set of usage conventions. Languages provide their users the greatest ability to access frameworks by implementing at least those parts of the CTS that are part of the CLS. Similarly, frameworks will be most widely used if their publicly exported aspects (e.g., classes, interfaces, methods, and fields) use only types that are part of the CLS and that adhere to the CLS conventions.
  • The Virtual Execution System (VES) - The VES implements and enforces the CTS model. The VES is responsible for loading and running programs written for the CLI. It provides the services needed to execute managed code and data, using the metadata to connect separately generated modules together at runtime (late binding).
Together, these aspects of the CLI form a unifying infrastructure for designing, developing, deploying, and executing distributed components and applications. The appropriate subset of the CTS is available from each programming language that targets the CLI. Language-based tools communicate with each other and with the VES using metadata to define and reference the types used to construct the application. The VES uses the metadata to create instances of the types as needed and to provide data type information to other parts of the infrastructure (such as remoting services, assembly downloading, and security).
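To make the CLS idea concrete, here is a small C# sketch (the class and method names are hypothetical). Marking an assembly CLSCompliant asks the compiler to warn whenever a publicly exported signature uses a type outside the CLS subset, such as an unsigned integer.

// A minimal sketch of CLS compliance checking in C#.
using System;

[assembly: CLSCompliant(true)]

public class Measurements
{
    // Fine: int is in the CLS subset, so any CLS-conformant language can call this.
    public int GetCount()
    {
        return 42;
    }

    // Compiler warning CS3002: uint is not CLS-compliant, so languages
    // without unsigned types could not consume this public method.
    public uint GetRawFlags()
    {
        return 0xFFu;
    }
}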

Thursday, June 9, 2016

Test-driven development

Test-driven development (TDD, a.k.a. test-first programming) represents an important change to the way most programmers think about writing code. The practice involves writing a test before each new piece of functionality is coded into the system. It is a software development process that relies on the repetition of a very short development cycle: requirements are turned into very specific test cases, and then the software is improved just enough to pass the new tests. This is in contrast to development processes that allow code to be added without proof that it meets requirements.
An interesting fact about bug-free code is that, as the code is further developed, the bug count in it is less likely to increase than in code that is known to have bugs. Developers are more careful to ensure they do not introduce any errors into a solution that is bug-free than they are when the solution is riddled with issues. This is what is known as "broken window syndrome": when code starts to fall into disrepair (has bugs), a typical belief follows that a few more bugs will not make any difference because the code is already broken. Therefore, it is important to keep the quality of the code high all the way through the process.
To develop zero-defect software, you need some tools and the correct way of thinking about quality. One of the most powerful tools is unit testing, and by developing the tests before writing the solution code, you are "turning the volume up." NUnit is a .NET Framework class library that can locate unit tests that have been written in your project. NUnit comes with a GUI Windows application and a console-based application, both of which enable you to run the tests and see the results. Tests either pass or fail, as indicated by a green or red progress bar, respectively.
The tests are intended to increase your confidence in the code you are writing; in effect, the tests drive the development process. By writing the tests first, you should find that your focus on the code changes substantially: your goal now is to define how the class will be used with a test method and then to get the test to pass.
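A minimal test-first sketch with NUnit follows; PriceCalculator and its method are hypothetical. The test is written first and fails (red) until the class below it is implemented, at which point it passes (green).

// A minimal NUnit sketch: the test comes first, the implementation after.
using NUnit.Framework;

[TestFixture]
public class PriceCalculatorTests
{
    [Test]
    public void ApplyDiscount_TenPercent_ReducesPrice()
    {
        var calculator = new PriceCalculator();

        decimal result = calculator.ApplyDiscount(100m, 0.10m);

        Assert.AreEqual(90m, result);
    }
}

// The simplest implementation that makes the test pass.
public class PriceCalculator
{
    public decimal ApplyDiscount(decimal price, decimal rate)
    {
        return price * (1 - rate);
    }
}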

Tuesday, May 31, 2016

eCommerce : Structure, Information Architecture, SEO

Electronic commerce, commonly written as e-commerce or eCommerce, is the trading or facilitation of trading in products or services using computer networks, such as the Internet. Electronic commerce draws on technologies such as mobile commerce, electronic funds transfer, supply chain management, Internet marketing, online transaction processing, electronic data interchange (EDI), inventory management systems, and automated data collection systems. Modern electronic commerce typically uses the World Wide Web for at least one part of the transaction's life cycle, although it may also use other technologies such as e-mail.

Information Architecture (IA) may sound dull but it’s a critical component of ecommerce and helps put the right data structures and standards in place to enable, amongst other things:
  • Site & catalogue structure
  • Core processes & functions e.g. site search
  • Business reporting & web analytics
  • SEO
Search Engine Optimization (SEO) is really just a series of educated guesses. It is educated guessing mixed with data-driven decisions that lead to more educated guesses and more data-driven decisions. Fear not though because even within all this guessing remain some basic principles and best practices. The most important SEO consideration for an eCommerce website is the website’s categorical structure / website architecture.

nopCommerce - ASP.NET open-source ecommerce software
nopCommerce is the leading open source shopping cart, allowing anyone to set up an online store quickly and easily.
One key feature of nopCommerce is its pluggable, modular, layered architecture, which allows additional functionality and presentation elements to be dynamically added to the application at run time. This pluggable, modularized architecture makes it easy to create and manage your websites.