Wednesday, August 26, 2015

Node.js Architecture

Node is an open-source toolkit for developing server-side applications based on the V8 JavaScript engine. Like Node, V8 is written in C++ and is best known for its use in Google Chrome.
Node belongs to the family of server-side JavaScript environments and extends the JavaScript API to offer the usual server-side functionality. Node's base API can be extended by using the CommonJS module system.

NodeJS is divided into two main components: the core and its modules. The core is built in C and C++. It combines Google's V8 JavaScript engine with Node's libuv library and bindings for protocols such as sockets and HTTP.
V8 Runtime Environment
Google’s V8 engine is an open-source just-in-time (JIT) compiler written in C++. In recent benchmarks, V8’s performance has surpassed that of other JavaScript interpreters, including SpiderMonkey and Nitro, and it has also outperformed PHP, Ruby, and Python. Because of this approach, some observers have speculated that its performance could eventually approach that of C. The engine compiles JavaScript directly into machine code ready for execution, avoiding intermediate representations such as tokens and opcodes that would then have to be interpreted. The runtime environment is itself divided into three major components: a compiler, an optimizer, and a garbage collector.
Libuv
The libuv library, written in C, is responsible for Node’s asynchronous I/O operations and its main event loop. It maintains a fixed-size thread pool from which a thread is allocated for each blocking I/O operation. By delegating these time-consuming operations to libuv, the V8 engine and the remainder of NodeJS are free to continue executing other requests. Before 2012, Node relied on two separate libraries, libev and libeio, to run the main event loop and provide asynchronous I/O. However, both were supported only on Unix. To add Windows support, the libuv library was fashioned as an abstraction around libev on Unix and around IOCP on Windows (the libev dependency was later removed entirely).

Sunday, July 26, 2015

What is Node.js?

Node.js is a platform built on Chrome's JavaScript runtime for easily building fast, scalable network applications. Node.js uses an event-driven, non-blocking I/O model that makes it lightweight and efficient, perfect for data-intensive real-time applications that run across distributed devices.

Node.js is an open source, cross-platform runtime environment for server-side and networking applications. Node.js applications are written in JavaScript and can be run within the Node.js runtime on OS X, Microsoft Windows, Linux, FreeBSD, NonStop, IBM AIX, IBM System z and IBM i. Its work is hosted and supported by the Node.js Foundation, a Collaborative Project at the Linux Foundation. Node.js provides an event-driven architecture and a non-blocking I/O API that optimize an application's throughput and scalability. These technologies are commonly used for real-time web applications.

Node.js uses the Google V8 JavaScript engine to execute code, and a large percentage of the basic modules are written in JavaScript. Node.js contains a built-in library to allow applications to act as a Web server without software such as Apache HTTP Server or IIS. Node.js is gaining adoption as a server-side platform and is used by IBM, Microsoft, Yahoo!, Walmart, Groupon, SAP, LinkedIn, Rakuten, PayPal, Voxer, and GoDaddy.

Node.js was created by Ryan Dahl and other developers working at Joyent, and was first published for Linux use in 2009. Its development and maintenance were spearheaded by Dahl and sponsored by Joyent, the firm where he worked.

Node.js allows the creation of web servers and networking tools, using JavaScript and a collection of "modules" that handle various core functionality. Modules handle file system I/O, networking (HTTP, TCP, UDP, DNS, or TLS/SSL), binary data (buffers), cryptography functions, data streams, and other core functions. Node's modules have a simple and elegant API, reducing the complexity of writing server applications.

Frameworks can be used to accelerate the development of applications, and common frameworks are Express.js, Socket.IO and Connect. Node.js applications can run on Microsoft Windows, Unix and Mac OS X servers. Node.js applications can alternatively be written with CoffeeScript (a more readable form of JavaScript), Dart or Microsoft TypeScript (strongly typed forms of JavaScript), or any language that can compile to JavaScript.

Monday, July 13, 2015

Evolution of the Microsoft NOS

"NOS" is the term used to describe a networked environment in which various types of resources, such as user, group, and computer accounts, are stored in a central repository that is controlled by, and accessible to, end users. Typically a NOS environment comprises one or more servers that provide NOS services, such as authentication and account manipulation, and multiple end users that access those services.
Microsoft's first integrated NOS environment became available in 1993 with the release of Windows NT 3.1, which combined many features of the LAN Manager protocols and the OS/2 operating system. The NT NOS slowly evolved over the following years until Active Directory was first released in beta in 1997.
Under Windows NT, the "domain" concept was introduced, providing a way to group resources based on administrative and security boundaries. NT domains are flat structures limited to about 40,000 objects (users, groups, and computers). For large organizations, this limitation imposed superficial boundaries on the design of the domain structure. Often, domains were geographically limited as well because the replication of data between domain controllers (i.e., servers providing the NOS services to end users) performed poorly over high-latency or low-bandwidth links. Another significant problem with the NT NOS was delegation of administration, which typically tended to be an all-or-nothing matter at the domain level.
Microsoft was well aware of these limitations and needed to rearchitect their NOS model into something that would be much more scalable and flexible. For that reason, they looked to LDAP-based directory services as a possible solution.
In generic terms, a directory service is a repository of network, application, or NOS information that is useful to multiple applications or users. Under this definition, the Windows NT NOS is a type of directory service. In fact, there are many different types of directories, including Internet white pages, email systems, and even the Domain Name System (DNS). While each of these systems has characteristics of a directory service, X.500 and the Lightweight Directory Access Protocol (LDAP) define the standards for how a true directory service is implemented and accessed.
Windows NT and Active Directory both provide directory services to clients (Windows NT in a more generic sense). And while both share some common concepts, such as Security Identifiers (SIDs) to identify security principals, they are very different from a feature, scalability, and functionality point of view. The table below contains a comparison of features between Windows NT and Active Directory.
Windows NT: Single-master replication is used, from the PDC master to the BDC subordinates.
Active Directory: Multimaster replication is used between all domain controllers.

Windows NT: Domain is the smallest unit of partitioning.
Active Directory: Naming Contexts and Application Partitions are the smallest units of partitioning.

Windows NT: System policies can be used locally on machines or set at the domain level.
Active Directory: Group policies can be managed centrally and used by clients throughout the forest based on domain, site, or OU criteria.

Windows NT: Data cannot be stored hierarchically within a domain.
Active Directory: Data can be stored in a hierarchical manner using OUs.

Windows NT: Domain is the smallest unit of security delegation and administration.
Active Directory: A property of an object is the smallest unit of security delegation/administration.

Windows NT: NetBIOS and WINS are used for name resolution.
Active Directory: DNS is used for name resolution.

Windows NT: Object is the smallest unit of replication.
Active Directory: Attribute is the smallest unit of replication. In Windows Server 2003 Active Directory, some attributes replicate on a per-value basis (such as the member attribute of group objects).

Windows NT: Maximum recommended database size for the SAM is 40 MB.
Active Directory: Recommended maximum database size for Active Directory is 70 TB.

Windows NT: Maximum effective number of users is 40,000 (if you accept the recommended 40 MB maximum).
Active Directory: The maximum number of objects is in the tens of millions.

Windows NT: Four domain models (single, single-master, multimaster, complete-trust) are required to solve per-domain admin-boundary and user-limit problems.
Active Directory: No domain models are required, as the complete-trust model is implemented. One-way trusts can be implemented manually.

Windows NT: Schema is not extensible.
Active Directory: Schema is fully extensible.

Windows NT: Data can only be accessed through a Microsoft API.
Active Directory: Supports LDAP, the standard protocol used by directories, applications, and clients that want to access directory data, allowing for cross-platform data access and management.

Table: A comparison between Windows NT and Active Directory

MS SQL Server 2016

SQL Server 2016 delivers breakthrough mission-critical capabilities with built-in in-memory performance and operational analytics. Comprehensive security features like the new Always Encrypted technology help protect your data at rest and in motion, and a world-class high availability and disaster recovery solution adds new enhancements to AlwaysOn technology.
Benefits:
  • Enhanced in-memory performance provides up to 30x faster transactions, more than 100x faster queries than disk-based relational databases, and real-time operational analytics
  • New Always Encrypted technology helps protect your data at rest and in motion, on-premises and in the cloud, with master keys sitting with the application, without application changes
  • Built-in advanced analytics provide the scalability and performance benefits of building and running your advanced analytics algorithms directly in the core SQL Server transactional database
  • Business insights through rich visualizations on mobile devices with native apps for Windows, iOS and Android
  • Simplified management of relational and non-relational data with the ability to query both through standard T-SQL using PolyBase technology
  • Stretch Database technology keeps more of your customers' historical data at your fingertips by transparently stretching your warm and cold OLTP data to Microsoft Azure in a secure manner without application changes
  • Faster hybrid backups, high availability, and disaster recovery scenarios to back up and restore your on-premises databases to Microsoft Azure and place your SQL Server AlwaysOn secondaries in Azure

Key Capabilities in SQL Server 2016 CTP2:
Always Encrypted
Always Encrypted, based on technology from Microsoft Research, protects data at rest and in motion. With Always Encrypted, SQL Server can perform operations on encrypted data and, best of all, the encryption key resides with the application in the customer's trusted environment. Encryption and decryption of data happen transparently inside the application, which minimizes the changes that have to be made to existing applications.
Stretch Database
This new technology allows you to dynamically stretch your warm and cold transactional data to Microsoft Azure, so your operational data is always at hand, no matter the size, and you benefit from the low cost of Azure.  You can use Always Encrypted with Stretch Database to extend data in a more secure manner for greater peace of mind.
Real-time Operational Analytics & In-Memory OLTP
For In-Memory OLTP, which customers today use for transactions up to 30x faster, you will now be able to apply this tuned transaction-performance technology to a significantly greater number of applications and benefit from increased concurrency. With these enhancements, we introduce the unique capability to run the in-memory columnstore on top of In-Memory OLTP, delivering queries up to 100x faster and providing real-time operational analytics while accelerating transaction performance.
Additional capabilities in SQL Server 2016 CTP2 include:
  • PolyBase – More easily manage relational and non-relational data with the simplicity of T-SQL.
  • AlwaysOn Enhancements – Achieve even higher availability and performance of your secondaries, with up to 3 synchronous replicas, DTC support and round-robin load balancing of the secondaries.
  • Row Level Security – Enables customers to control access to data based on the characteristics of the user. Security is implemented inside the database, requiring no modifications to the application.
  • Dynamic Data Masking – Supports real-time obfuscation of data so data requesters do not get access to unauthorized data.  Helps protect sensitive data even when it is not encrypted.
  • Native JSON support – Allows easy parsing and storing of JSON and exporting relational data to JSON.
  • Temporal Database support – Tracks historical data changes with temporal database support.
  • Query Data Store – Acts as a flight data recorder for a database, giving full history of query execution so DBAs can pinpoint expensive/regressed queries and tune query performance.
  • MDS enhancements – Offer enhanced server management capabilities for Master Data Services.
  • Enhanced hybrid backup to Azure – Enables faster backups to Microsoft Azure and faster restores to SQL Server in Azure Virtual Machines.  Also, you can stage backups on-premises prior to uploading to Azure.

Advanced Message Queuing Protocol (AMQP)

The Advanced Message Queuing Protocol (AMQP) is an open standard that defines a protocol for systems to exchange messages. AMQP defines not only the interaction that happens between a consumer/producer and a broker, but also the over-the-wire representation of the messages and commands that are being exchanged. Since it specifies the wire format for messages, AMQP is truly interoperable - nothing is left to the interpretation of a particular vendor or hosting platform.
AMQP originated in 2003 with John O'Hara at JPMorgan Chase in London, UK. From the beginning, AMQP was conceived as a co-operative open effort, with initial development by JPMorgan Chase. AMQP is a binary, application-layer protocol designed to efficiently support a wide variety of messaging applications and communication patterns. It provides flow-controlled, message-oriented communication with message-delivery guarantees such as at-most-once (each message is delivered once or never), at-least-once (each message is certain to be delivered, but may be delivered multiple times), and exactly-once (each message arrives once and only once), as well as authentication and/or encryption based on SASL and/or TLS. It assumes an underlying reliable transport-layer protocol such as the Transmission Control Protocol (TCP).
List of core concepts of AMQP:
  • Broker: This is a middleware application that can receive messages produced by publishers and deliver them to consumers or to another broker.
  • Virtual host: This is a virtual division in a broker that allows the segregation of publishers, consumers, and all the AMQP constructs they depend upon, usually for security reasons (such as multitenancy).
  • Connection: This is a physical network (TCP) connection between a publisher/consumer and a broker. The connection only closes on client disconnection or in the case of a network or broker failure.
  • Channel: This is a logical connection between a publisher/consumer and a broker. Multiple channels can be established within a single connection. Channels allow the isolation of the interaction between a particular client and broker so that they don't interfere with each other. This happens without opening costly individual TCP connections. A channel can close when a protocol error occurs.
  • Exchange: This is the initial destination for all published messages and the entity in charge of applying routing rules for these messages to reach their destinations. Routing rules include the following: direct (point-to-point), topic (publish-subscribe) and fanout (multicast).
  • Queue: This is the final destination for messages ready to be consumed. A single message can be copied and can reach multiple queues if the exchange's routing rule says so.
  • Binding: This is a virtual connection between an exchange and a queue that enables messages to flow from the former to the latter. A routing key can be associated with a binding in relation to the exchange routing rule.
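These concepts can be made concrete with a toy, in-memory model in plain JavaScript. This is not a real AMQP client; the queue and routing-key names are invented, and only the direct and fanout rules are sketched:

```javascript
// Toy model of AMQP routing: an exchange applies its routing rule to
// decide which bound queues receive a copy of each published message.
const queues = { q1: [], q2: [] };

// Bindings: virtual connections from the exchange to queues, each
// optionally carrying a routing key.
const bindings = [
  { queue: 'q1', key: 'orders.created' },
  { queue: 'q2', key: 'orders.cancelled' },
];

function directPublish(message, routingKey) {
  // Direct exchange: deliver to queues whose binding key matches exactly.
  for (const b of bindings) {
    if (b.key === routingKey) queues[b.queue].push(message);
  }
}

function fanoutPublish(message) {
  // Fanout exchange: copy the message to every bound queue.
  for (const b of bindings) queues[b.queue].push(message);
}

directPublish({ id: 1 }, 'orders.created'); // only q1 receives it
fanoutPublish({ id: 2 });                   // both queues receive a copy
```

A topic exchange would generalize directPublish by matching the routing key against wildcard patterns in the binding keys rather than requiring exact equality.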

Comparison of some main features between AMQP and other protocols:
  • Java Message Service (JMS): Unlike AMQP, JMS defines only a Java programming interface, not a wire protocol or message representation. As such, JMS is not interoperable and only works when compatible clients and brokers are used. Moreover, unlike AMQP, it does not define the commands necessary to completely configure messaging routes, leaving too much room for vendor-specific approaches. Finally, in JMS, message producers target a particular destination (queue or topic), meaning the clients need to know about the target topology. In AMQP, the routing logic is encapsulated in exchanges, sparing the publishers from this knowledge.
  • MQ Telemetry Transport (MQTT): This is an extremely lightweight message-queuing protocol. MQTT focuses only on the publish-subscribe model. Like AMQP, it is interoperable and is very well suited for massive deployments in embedded systems. Like AMQP, it relies on a broker for subscription management and message routing. RabbitMQ can speak the MQTT protocol—thanks to an extension.
  • ØMQ (also known as ZeroMQ): This offers messaging semantics without the need for a centralized broker (but without the persistence and delivery guarantees that a broker provides). At its core, it is an interoperable networking library. Implemented in many languages, it's a tool of choice for the construction of high-performance and highly-available distributed systems.
  • Process inboxes: Programming languages and platforms such as Erlang or Akka offer messaging semantics too. They rely on a clustering technology to distribute messages between processes or actors. Since they are embedded in the hosting applications, they are not designed for interoperability. 


Wednesday, May 20, 2015

Comparison - Elasticsearch vs MongoDB

  • Description: Elasticsearch: a modern enterprise search engine based on Apache Lucene; MongoDB: one of the most popular document stores
  • DB-Engines Ranking: Elasticsearch: rank 14, score 64.83; MongoDB: rank 4, score 277.32
  • Database model: Elasticsearch: search engine; MongoDB: document store
  • Developer: Elasticsearch: Elastic; MongoDB: MongoDB, Inc.
  • Initial release: Elasticsearch: 2010; MongoDB: 2009
  • License: Elasticsearch: open source; MongoDB: open source
  • Database as a Service: Elasticsearch: no; MongoDB: no
  • Implementation language: Elasticsearch: Java; MongoDB: C++
  • Server operating systems: Elasticsearch: all OSes with a Java VM; MongoDB: Linux, OS X, Solaris, Windows
  • Data scheme: Elasticsearch: schema-free; MongoDB: schema-free
  • APIs and other access methods: Elasticsearch: Java API, RESTful HTTP/JSON API; MongoDB: proprietary protocol using JSON
  • Server-side scripts: Elasticsearch: no; MongoDB: JavaScript
  • Triggers: Elasticsearch: yes; MongoDB: no
  • Partitioning methods: Elasticsearch: sharding; MongoDB: sharding
  • Replication methods: Elasticsearch: yes; MongoDB: master-slave replication
  • MapReduce: Elasticsearch: no; MongoDB: yes
  • Consistency concepts: Elasticsearch: eventual consistency; MongoDB: eventual consistency, immediate consistency
  • Foreign keys: Elasticsearch: no; MongoDB: no
  • Transaction concepts: Elasticsearch: no; MongoDB: no
  • Durability: Elasticsearch: yes; MongoDB: yes
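The difference in access methods is the one you feel first in code. A sketch of the same lookup in each system's idiom (the index and field names are hypothetical, and neither request is actually sent anywhere):

```javascript
// Elasticsearch is driven over RESTful HTTP with JSON request bodies...
const esSearch = {
  method: 'POST',
  path: '/articles/_search',                       // hypothetical index
  body: { query: { match: { title: 'nosql' } } },
};

// ...while MongoDB drivers send JSON-like query documents over the
// server's own wire protocol (shown here as a driver-style filter).
const mongoFilter = { title: { $regex: 'nosql', $options: 'i' } };

console.log(JSON.stringify(esSearch.body));
console.log(JSON.stringify(mongoFilter));
```

Because the Elasticsearch request is ordinary HTTP plus JSON, any HTTP client can act as a driver; MongoDB access instead goes through a language-specific driver that speaks its protocol.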

Tuesday, May 19, 2015

NoSQL Database

NoSQL encompasses a wide variety of database technologies that were developed in response to a rise in the volume of data stored about users, objects, and products, the frequency with which this data is accessed, and growing performance and processing needs. Relational databases, by contrast, were not designed to cope with the scale and agility challenges that face modern applications, nor were they built to take advantage of the cheap storage and processing power available today.
NoSQL DEFINITION: Next Generation Databases mostly addressing some of the points: being non-relational, distributed, open-source and horizontally scalable.
A NoSQL (often interpreted as Not only SQL) database provides a mechanism for storage and retrieval of data that is modeled in means other than the tabular relations used in relational databases. Motivations for this approach include simplicity of design, horizontal scaling, and finer control over availability. The data structures used by NoSQL databases (e.g. key-value, graph, or document) differ from those used in relational databases, making some operations faster in NoSQL and others faster in relational databases. The particular suitability of a given NoSQL database depends on the problem it must solve.
NoSQL Database Types
  • Document databases pair each key with a complex data structure known as a document. Documents can contain many different key-value pairs, or key-array pairs, or even nested documents.
  • Graph stores are used to store information about networks, such as social connections. Graph stores include Neo4J and HyperGraphDB.
  • Key-value stores are the simplest NoSQL databases. Every single item in the database is stored as an attribute name (or "key"), together with its value. Examples of key-value stores are Riak and Voldemort. Some key-value stores, such as Redis, allow each value to have a type, such as "integer", which adds functionality.
  • Wide-column stores such as Cassandra and HBase are optimized for queries over large datasets, and store columns of data together, instead of rows.
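The two simplest of these models can be sketched in plain JavaScript, using Map objects as a stand-in for a real store (the keys and records are invented for illustration):

```javascript
// Key-value model: an opaque key maps to a single value.
const kvStore = new Map();
kvStore.set('session:42', 'alice');

// Document model: each key holds a richer structure that can contain
// key-value pairs, key-array pairs, and nested documents.
const docStore = new Map();
docStore.set('user:42', {
  name: 'alice',
  emails: ['alice@example.com'],   // key-array pair
  address: { city: 'Oslo' },       // nested document
});

console.log(kvStore.get('session:42'));            // prints "alice"
console.log(docStore.get('user:42').address.city); // prints "Oslo"
```

The trade-off mirrors the descriptions above: the key-value store can only fetch whole values by key, while the document store lets queries reach inside the stored structure.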

The Benefits of NoSQL
When compared to relational databases, NoSQL databases are more scalable and provide superior performance, and their data model addresses several issues that the relational model is not designed to address:
  • Large volumes of structured, semi-structured, and unstructured data
  • Agile sprints, quick iteration, and frequent code pushes
  • Object-oriented programming that is easy to use and flexible
  • Efficient, scale-out architecture instead of expensive, monolithic architecture