
News

Posted over 11 years ago by Park Kieun
In the previous article I explained how data is stored in CUBRID RDBMS and described the concepts of objects, classes, inheritance, and OID in CUBRID. In this article I will talk about the data types available in CUBRID and how they are stored in the database. Since CUBRID is an object-relational DBMS, it also allows you to define your own custom data types, and I will show you how to do that.

Overview

A relational database system is designed to manage a limited number of data types such as integer, floating-point number, string, boolean, date, time, and currency. In other words, it is not designed to manage data types that users have created themselves. One of the core points of the object-oriented data model, on the other hand, is that creating new, custom data types is supported in a consistent way. Moreover, the object-oriented data model supports encapsulation of data together with the programs that access it, and a framework for user-defined data types, in addition to the structure, relations, and constraints of data expressed in the relational model. With encapsulation and inheritance (reuse), it substantially relieves the difficulty of designing a complex software system.

In CUBRID, the attributes (columns) defined in classes (tables) have names and domains. The name is used as the identifier of a column, and the domain specifies the type of data the column can hold. The data type defines the format in which the data is saved to the database. In addition to the default data types listed below, other classes (tables) in the database can also be specified as a domain. If you want to use a class (table) as a domain instead of a default data type, use the class name (table name) as the name of the data type. There is one more way to use a class as a domain: to allow any class as the domain of a column, specify the OBJECT type as the domain.

Default Data Types

Numeric data types: SHORT or SMALLINT, INT or INTEGER, BIGINT, NUMERIC or DECIMAL, FLOAT or REAL, DOUBLE or DOUBLE PRECISION, MONETARY
Date/Time data types: DATE, TIME, TIMESTAMP, DATETIME
Bit string data types: BIT, BIT VARYING
String data types: CHAR, VARCHAR (CHAR VARYING or STRING), NCHAR, NCHAR VARYING
BLOB/CLOB data types: BLOB, CLOB
Collection data types: SET, MULTISET, LIST or SEQUENCE
ENUM data type: ENUM

Each data type in a database has a size, which determines how large a value a user can enter for that type, as well as a specific format. The INTEGER data type, for instance, is a 32-bit integer, i.e. 4 bytes. The STRING data type is a string of variable length, i.e. a sequence of bytes with a recorded length. In other words, a data type is a rule for interpreting the bytes that a column's data occupies in a record. When the data type of a column is INTEGER, the column value occupies 4 bytes and those 4 bytes are interpreted as an integer. When the data type of a column is STRING, the value is a variable-length string: the length of the string is saved at the beginning, followed by the string itself. If the length is 255 or less, so that it fits in one byte, the first byte holds the length and the following bytes hold the string data. If the length is greater than 255, 0xFF is saved in the first byte and the actual length is saved as an integer in the next 4 bytes.
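To make the length-prefix scheme above concrete, here is a minimal C sketch of how such an encoding could work. It is purely illustrative and is not the actual CUBRID serialization routine; the function name and buffer handling are invented for the example, and it reserves 0xFF strictly as the escape marker so that decoding stays unambiguous.

#include <stdint.h>
#include <string.h>

/* Illustrative sketch of the length-prefix scheme described above
 * (not the actual CUBRID code): short strings store the length in
 * one byte; longer strings store the escape marker 0xFF followed by
 * the length as a 4-byte integer (host byte order, for simplicity). */
size_t encode_string(uint8_t *buf, const char *s)
{
    uint32_t len = (uint32_t) strlen(s);
    size_t pos = 0;

    if (len < 0xFF) {
        buf[pos++] = (uint8_t) len;           /* 1-byte length prefix */
    } else {
        buf[pos++] = 0xFF;                    /* escape marker */
        memcpy(buf + pos, &len, sizeof(len)); /* 4-byte length */
        pos += sizeof(len);
    }
    memcpy(buf + pos, s, len);                /* string payload */
    return pos + len;                         /* total bytes written */
}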
In this sense, a data type, or a domain, stands for the format in which values of various kinds are saved in the database.

Class Domains (or tables as data types)

You now know how the default data types are stored in CUBRID. But which storage format is used for a column whose domain (data type) is a class (table)? Does the column store the whole structure of the class specified as its domain? For example, as illustrated in Figure 1 below, if the person class is specified as the domain of the emp attribute of the employee class, you might expect the storage size of the emp column to be the sum of the name, dob, ssn, and marital_status columns. In CUBRID, however, it is not. On the storage side, the emp column holds only an OID (Object Identifier). This is where CUBRID differs from Oracle and from the SQL3 (SQL:1999) standard. For a database with object-oriented concepts there are two approaches: Oracle defines a new complex data type out of simple data types and specifies it as the data type of a column, while CUBRID simply refers to another class.

Figure 1: Example of Using a Class (Table) as a Domain (Data Type).

So, as mentioned earlier, each data type has a size (number of bytes) as well as a specific format. In a database there is also one special value that is defined regardless of data type: NULL. This value has neither a value nor a data type. For example, the 'PARK.K.E' string, one of the values of the name column of the person table in the figure above, belongs to the domain whose data type is VARCHAR(30). The VARCHAR(30) domain ranges from the empty string up to 'ÿÿ…ÿ', a string of length 30 filled with ÿ, the last character of the ISO 8859-1 Latin 8-bit character set. NULL, however, can be a value of any type, such as INTEGER, NUMERIC, STRING, or BLOB. In other words, every domain in a database includes the NULL value; the domain of the INTEGER type is the set of integers that can be expressed in 4 bytes, plus NULL.

Let's get back to class domains. CUBRID allows other classes (tables) to be specified as domains (data types). However, the value that is actually saved is an OID, which indicates the physical address of an instance object of the class specified as the domain. You might think that all class domains therefore have an identical structure and there is no distinction among them, but that is not true. The user-defined schema information records which class was specified as the domain. Whenever a user tries to set a value for a column, its data type is checked; for a class domain, the system checks whether the value refers to an object of the class specified in the schema.

In CUBRID, the types are defined and built into the system in advance, while a domain is specified by the user at database design time. Accordingly, the domains for the default data types are declared and defined in advance. In the source code, the data structure that holds domain information is the TP_DOMAIN structure defined in the object/object_domain.h file. In that file, the domain data structures for the default data types, such as tp_Integer_domain, are already defined; each default data type such as INTEGER and STRING has its own pre-defined global variable of the TP_DOMAIN type. When a user specifies a class as a domain, a TP_DOMAIN structure for it is created dynamically and saved. This is how class domains are stored and referenced in CUBRID RDBMS.
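The idea that a class-domain column stores nothing but an OID, and that the schema remembers which class the domain refers to, can be sketched roughly as follows. The struct and function names below are invented stand-ins (the real definitions are the OID in the CUBRID headers and TP_DOMAIN in object/object_domain.h); this is a conceptual sketch, not CUBRID code.

#include <stdbool.h>

/* Simplified stand-ins for the notions discussed above. */
typedef struct { int pageid; short slotid; short volid; } oid_t;

typedef enum { TYPE_INTEGER, TYPE_STRING, TYPE_OBJECT } type_id_t;

typedef struct domain {
    type_id_t type;       /* built-in type, or TYPE_OBJECT for a class domain */
    oid_t     class_oid;  /* which class the domain refers to (class domains only) */
} domain_t;

/* What a class-domain column value actually stores: just an OID that
 * points at an instance object of the domain class. */
typedef struct {
    const domain_t *domain;
    oid_t           ref;  /* reference to the instance object */
} object_value_t;

static bool oid_equal(oid_t a, oid_t b)
{
    return a.volid == b.volid && a.pageid == b.pageid && a.slotid == b.slotid;
}

/* Sketch of the check performed when a value is assigned to a
 * class-domain column: the referenced object must belong to the
 * class recorded in the schema. */
bool check_class_domain(const domain_t *dom, oid_t value_class_oid)
{
    return dom->type == TYPE_OBJECT && oid_equal(dom->class_oid, value_class_oid);
}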
I will discuss the TP_DOMAIN data structure in more detail in one of the coming articles.

Conclusion

This concludes the talk about data types, domains, classes, and attributes in CUBRID. You now know that in CUBRID you can easily create a custom data type that represents one of the existing tables, which is a really convenient feature. It allows you to inherit certain attributes of base tables and extend them. In the next article I will explain the different modes of execution in CUBRID, such as client-server mode and standalone mode. The client-server architecture of CUBRID improves processing speed and allows database functions to be distributed across multiple hosts, as several clients cache the database objects saved on the server. I will explain more about the advantages of this client-server concept in the next article.
Posted over 11 years ago by Park Kieun
In this article I will explain how data is stored in CUBRID RDBMS. I will describe the concepts of objects, classes, and OID in CUBRID.

One of the characteristics of CUBRID that is often described as an "extension of the relational data model" is its object-oriented model. CUBRID has many object-oriented concepts. All data records are treated as objects, and the tables that define their structure are treated as classes that define those objects. CUBRID is implemented using object-oriented concepts while providing users with the relational model and SQL, the relational query language. In addition, it provides extended relational data model features such as inheritance between classes, collection data types (SET, MULTISET, LIST), and composition relations. In the relational data model a single column is not allowed to have multiple values. In CUBRID, however, you can define multiple values for a column; the collection data types exist for this purpose. They are divided into SET, MULTISET, and LIST depending on whether duplication of elements is allowed.

Inheritance

Inheritance is a concept that allows columns and methods defined in parent tables to be reused in child tables. CUBRID supports inheritance for reusability. Using this feature, you can create a parent table with some common columns and then create child tables that inherit from the parent and add their own unique columns. This way you can model a database while minimizing the number of columns needed.

OID

In a relational database, a relationship is defined by letting the referring table hold the primary key of the referred table as a foreign key. If the primary key consists of multiple columns or the key is large, the performance of join operations between the tables degrades. CUBRID, however, allows the direct use of the physical address (OID) where the records of the referred table are located, so you can define relationships without join operations. That is, in an object-oriented database like CUBRID you can create a composition relation, in which one record holds a reference value to another, by using the referred class as the domain (type) of a column instead of referring to the referred table's primary key column.

Generally, in object-oriented programs, objects are the actual data in memory, and object pointers are used to point to them. CUBRID, by contrast, handles database objects stored on disk, so it cannot reference them with memory pointers. Instead, it issues a unique Object Identifier (OID) for each object. An OID indicates the physical address of a database object, its absolute location in the database volume file on disk. Like a memory pointer, which stands for a physical address in memory, an OID is a physical address in the database. The OID consists of a volume number (volid), a page number within the volume (pageid), and a slot number within the page (slotid). The following code, excerpted from the CUBRID source, defines the OID:

typedef struct db_identifier DB_IDENTIFIER;
struct db_identifier
{
  int pageid;
  short slotid;
  short volid;
};
typedef DB_IDENTIFIER OID;

Slotted pages

A DB object, or data record, in CUBRID is saved in a slotted page, the traditional disk storage structure of an RDBMS. A DBMS stores its data on disk, so disk I/O is performed in units of disk pages (or database pages), just as the operating system does.
The size of a database page in CUBRID is 4KB or 16KB, the latter being the default page size (see --db-page-size). One page holds several data records (or DB objects). Therefore, to fetch a specific record, the location of the record within its page and the length of the record must be known. The simplest approach would be to lay records out one after another from the start of the page. However, when a new record is created or an existing one is deleted, the contents and lengths of the remaining records change frequently. So we need a way to avoid moving other records whenever one record changes, and to quickly find the location of the desired record (from which byte in the page to read), even though every record has a different length. For this reason most DBMSs implement the slotted page structure.

Figure 1: CUBRID Slotted Page Structure.

As shown in Figure 1, one page holds several records, and the location of each record is recorded in the slot area at the end of the page. Each slot is 4 bytes, and the slots are numbered from the end of the page as slot 1, slot 2, ..., slot N. Slot 1 holds the location of record 1, i.e. its offset within the page, slot 2 holds the location of record 2, and so on. In the figure above, slot 6 does not point to a record (its value is -1), which means that record 6 has been deleted. As the figure shows, records are filled in from the start of the page while slots are filled in from the end of the page. A slot saves the offset together with the size of the record: the 4-byte slot consists of a 14-bit offset, a 14-bit length, and a 4-bit record type, so there can be up to 16 record types. Seven record types are currently defined in CUBRID; they can be found in storage/slotted_page.h of the source code (for example, #define REC_HOME 2 defines one record type).

As shown in the OID structure above, an OID consists of a volume number, a page number, and a slot number. The volume number identifies the file where the record is stored, the page number tells which part of that file to read, and the slot number shows where on that page the desired record data is located.

Classes

As in the general object-oriented model, the structure of DB objects in CUBRID is expressed through classes. This is analogous to the relationship between data records and table schemas in a relational database. In an object-oriented language such as Java or C++, a class is a template used to declare the structure of objects and does not physically exist. It is different in CUBRID, where a class is itself a kind of DB object. In other words, a class is a data record holding certain information, just like any other DB object. A DB instance object is a record that holds user data. A DB class object is a record that holds data about the structure (the table schema) of the instance objects (records) belonging to that class (table). Table schema data describes the columns in the table, the data type of each column, and the table or column constraints defined by the user. In a typical relational DBMS this data is called schema data or the data dictionary, and it is saved and managed in a separate space in a special format. In CUBRID, however, it is handled as just another DB object, in keeping with CUBRID's object-oriented design.
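Returning briefly to the slotted-page layout described above, the following sketch shows how a 4-byte slot with a 14-bit offset, a 14-bit length, and a 4-bit record type could be packed and unpacked. The macros and helper functions are illustrative only; the actual layout used in storage/slotted_page.h may differ.

#include <stdint.h>

/* Illustrative packing of the 4-byte slot described above:
 * 14-bit offset within the page, 14-bit record length and a
 * 4-bit record type (so up to 16 types). Not the CUBRID layout. */
#define SLOT_OFFSET_BITS 14
#define SLOT_LENGTH_BITS 14

static uint32_t slot_pack(uint32_t offset, uint32_t length, uint32_t type)
{
    return (offset & 0x3FFF)
         | ((length & 0x3FFF) << SLOT_OFFSET_BITS)
         | ((type & 0xF) << (SLOT_OFFSET_BITS + SLOT_LENGTH_BITS));
}

static uint32_t slot_offset(uint32_t slot) { return slot & 0x3FFF; }
static uint32_t slot_length(uint32_t slot) { return (slot >> SLOT_OFFSET_BITS) & 0x3FFF; }
static uint32_t slot_type(uint32_t slot)   { return slot >> (SLOT_OFFSET_BITS + SLOT_LENGTH_BITS); }

/* Conceptually, following an OID then means: open the volume file (volid),
 * read the page (pageid), and read the slot (slotid), counted from the end
 * of the page, to find the record's offset and length within that page. */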
In the CUBRID source code you can see that a DB object is handled according to its object type (class object or instance object).

Root Class

If a class (table schema) is itself a DB object, then how is the structure of class objects defined? Every object must have a class, so is a separate class required for class objects? The answer is yes. Class objects belong to a class called the root class. Class objects are therefore instance objects of the root class; in other words, they are records of the table defined as the root class.

Figure 2: Root Class and Class Object in CUBRID.

A root class can be thought of as a table that contains the table definitions. In the SQL standard this concept is called an information schema, generally known as a system catalog. To see which tables are defined in the database, you run a query such as SELECT table_name FROM tables;. To see which columns are defined for a table, you run SELECT column_name FROM columns WHERE table_name='t1';. The tables and columns tables used here are the information schema, i.e. system catalog tables. To be exact, the system catalog tables in CUBRID (e.g., the db_class table) exist separately, and the class objects are saved and managed apart from them. From a structural point of view, system catalog tables are identical to ordinary tables: they are tables the system creates in advance when a database is created. The class objects themselves are used internally. The record structure (schema) of a class object is defined in the source code in advance, so when a class object is read from disk it is converted into a memory object in order to interpret its contents. The C structure that defines the memory object structure of a class object is struct sm_class in the object/class_object.h file.

Conclusion

This concludes the talk about how data is stored in CUBRID RDBMS. I have explained the individual building blocks of the storage layer, such as data volumes, pages, and page slots, which point to the actual data records. You now know what constitutes an Object Identifier (OID). When data on disk is accessed directly through an OID, CUBRID can skip recalculating the physical address of the requested record, since the OID already encodes it, which gives an additional performance benefit. You have also learned that everything in CUBRID is an object, either a class object or an instance object: instance objects belong to class objects, while class objects are instances of the root class. In the next article I will talk about data types and domains in CUBRID and how you can inherit data types.
Posted over 11 years ago by Park Kieun
In this blog I will describe the concurrency control methods implemented in database management systems and the differences between them. I will also explain which locking technique is used in CUBRID RDBMS, the locking modes and their compatibility, and finally deadlocks and how they are resolved.

Overview

When multiple transactions that change data are executed simultaneously, the order in which they are processed must be controlled in order to satisfy the ACID (Atomicity, Consistency, Isolation, Durability) properties of the database. Executing multiple transactions simultaneously should produce the same result as executing each transaction independently; in other words, one transaction should not be affected by another. If each transaction changes different data, there is no interference between them and thus no issue. However, if the same data is changed simultaneously by multiple transactions, the order of processing must be controlled.

Types of Concurrency Control

For example, suppose transaction T1 changes record A from 1 to 2 and then changes record B, while transaction T2 simultaneously changes record A as well. Let's assume T2 changes A from 2 to 4 by adding 2. If both transactions commit successfully, there is no issue. But every transaction must also be able to roll back. If T1 is rolled back, the value of A should return to 1, its value before T1 executed; this is required to satisfy the ACID properties. However, T2 has already changed the value of A to 4, so A can no longer be returned to 1. There are two ways to handle this.

Two-phase locking (2PL)

The first is that when T2 tries to change record A, it notices that T1 has already changed it and waits until T1 completes, because T2 cannot know whether T1 will be committed or rolled back. This method is called two-phase locking (2PL).

Multi-version concurrency control (MVCC)

The other is to allow each of T1 and T2 to have its own changed version. Even after T1 has changed A from 1 to 2, the original value 1 is left as it is and the system records that T1's version of A is 2. Then T2 changes A from 1 to 3, not from 2 to 4, and the system records that T2's version of A is 3. If T1 is rolled back, it does not matter: its version 2 is simply never applied to A. If T2 then commits, its version 3 is applied to A. If T1 commits before T2, A is changed to 2 and then to 3 when T2 commits. The final database state is identical to the state produced by executing each transaction independently, without any effect on other transactions, so it satisfies the ACID properties. This method is called multi-version concurrency control (MVCC).

CUBRID implements the 2PL method, as do DB2 and SQL Server, while Oracle, InnoDB, and PostgreSQL implement MVCC.
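As a toy illustration of the multi-versioning idea above (and of why T1's rollback is harmless under MVCC), the following C sketch keeps a committed value plus one uncommitted version per transaction. It is not how CUBRID or any particular DBMS implements MVCC, and all names are invented for the example; under 2PL the same scenario would instead block T2 until T1 ends.

#include <stdio.h>

/* One record with a committed value and per-transaction uncommitted versions. */
#define MAX_TXNS 8

typedef struct {
    int committed;            /* last committed value */
    int txn_value[MAX_TXNS];  /* each transaction's private version */
    int has_version[MAX_TXNS];
} record_t;

static void write_version(record_t *r, int txn, int value)
{
    r->txn_value[txn] = value;
    r->has_version[txn] = 1;
}

static void rollback(record_t *r, int txn) { r->has_version[txn] = 0; }

static void commit(record_t *r, int txn)
{
    if (r->has_version[txn]) {
        r->committed = r->txn_value[txn];
        r->has_version[txn] = 0;
    }
}

int main(void)
{
    record_t a = { .committed = 1 };
    write_version(&a, 1, 2);  /* T1: A = 1 -> 2 (its own version)      */
    write_version(&a, 2, 3);  /* T2: A = 1 + 2 = 3 (its own version)   */
    rollback(&a, 1);          /* T1 rolls back: committed A stays 1    */
    commit(&a, 2);            /* T2 commits: committed A becomes 3     */
    printf("A = %d\n", a.committed);
    return 0;
}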
Two-phase locking in CUBRID

The 2PL adopted by CUBRID uses locks to ensure consistency between transactions that change the same data. As the name implies, locking proceeds through two phases:

an expanding phase (acquiring locks)
a shrinking phase (releasing locks)

More precisely, every transaction must acquire a lock on the data it accesses, and the acquired locks are released only when the transaction terminates. After a transaction has acquired a lock on some data (regardless of the lock type: S_LOCK for read, which stands for shared lock, or X_LOCK for write, which stands for exclusive lock), when another transaction tries to acquire a new lock on the same data, the new lock is either granted or blocked according to the lock compatibility rules. Therefore the success or failure of an earlier transaction has no impact on the following transactions, and data consistency is maintained.

Lock Manager in CUBRID

Thus the key point of 2PL, as adopted by CUBRID, is that locking is processed through the two phases, expanding and shrinking: as Figure 1 shows, all locks acquired while executing a transaction are released only after the transaction ends (commit or rollback).

Figure 1: Two-Phase Locking.

The 2PL concurrency control method naturally controls access to the same data by making all transactions observe the 2PL protocol. Figure 2 below shows an example of three transactions using 2PL: Transaction 1 executes B=B+A, Transaction 2 executes C=A+B, and Transaction 3 executes Print C. Since all three transactions access the data A, B, and C, concurrency control is required. Each transaction is executed according to the 2PL protocol so that there is no data conflict.

Figure 2: Concurrency Control by using 2PL.

Lock modes

To understand the concurrency control of multiple transactions more deeply, let's discuss lock modes, lock conversion, and transaction isolation levels. In the figure above you can see that an S-lock (shared lock) on A was first acquired by Transaction 1 and is also acquired by Transaction 2. By contrast, a transaction that requests an X-lock is blocked until the S-lock is released. A variety of lock modes are used in this way to minimize conflicts between locks. The major types of locks used in DBMSs are:

Shared (S) Lock: Used for read operations. It is generally set on the target record when a SELECT statement is executed. It prevents a transaction from changing data that has already been read by other transactions.
Exclusive (X) Lock: Used for write operations such as INSERT, UPDATE, and DELETE. It prevents one piece of data from being changed by multiple transactions.
Update (U) Lock: Declares that the target resource will be changed. It is used to reduce the deadlocks that can occur when multiple transactions both read and write.
Intent Shared (IS) Lock: Set on an upper-level resource (e.g. a table) in order to set S-locks on some lower-level resources (e.g. records or pages). It prevents other transactions from setting an X-lock on the upper resource. Intent locks are described below.
Intent Exclusive (IX) Lock: Set on an upper-level resource in order to set X-locks on some lower-level resources.
Shared with Intent Exclusive (SIX) Lock: Set on an upper-level resource in order to set an S-lock on it and X-locks on some lower-level resources.

Lock mode compatibility

Among the lock modes above, the intent locks are used to improve transaction concurrency and to prevent conflicts between upper-level and lower-level resources.
For example, when Transaction A tries to read record R in table T, it sets IS_LOCK on table T before setting S_LOCK on record R. Transaction B is then prevented from setting X_LOCK on table T to change the structure of table T. If Transaction A had not set IS_LOCK on table T, Transaction B could change the structure of table T, and Transaction A would then perform an incorrect read. Thanks to the intent lock, Transaction B also does not need to check every record in table T for locks set by other transactions before setting X_LOCK on table T. The following lock mode compatibility table shows the effect of intent locks clearly.

Table 1: The lock mode compatibility table of CUBRID (rows: newly requested lock mode, columns: currently held lock mode).

            NULL   IS     NS     S      IX     SIX    U      NX     X
NULL        True   True   True   True   True   True   True   True   True
IS          True   True   N/A    True   True   True   N/A    N/A    False
NS          True   N/A    True   True   N/A    N/A    False  True   False
S           True   True   True   True   False  False  False  False  False
IX          True   True   N/A    False  True   False  N/A    N/A    False
SIX         True   True   N/A    False  False  False  N/A    N/A    False
U           True   N/A    True   True   N/A    N/A    False  False  False
NX          True   N/A    True   False  N/A    N/A    False  False  False
X           True   False  False  False  False  False  False  False  False

From the lock mode compatibility table you can see that X_LOCK cannot be set on a table while IS_LOCK is set on it, and that only IS_LOCK is compatible with SIX_LOCK. This means a SIX_LOCK holder intends to set an S_LOCK on the table and X_LOCKs on some records, and it allows nothing but IS_LOCK, i.e. S_LOCKs on other, non-conflicting records. You can also see that IX_LOCK is compatible with IX_LOCK: IX_LOCK only declares the intent to set X_LOCK on some records, so the two can coexist. If two transactions try to change the same record, IX_LOCK on the table is granted to both; concurrency control is still preserved because only the transaction that first acquires X_LOCK on the record can change it (X_LOCK and X_LOCK are not compatible).

The lock mode compatibility table is expressed as the global variable lock_Comp[][] in the lock_table.c file of the CUBRID source code. Most of the code related to lock modes is implemented in the lock_manager.c file. To set a lock on a data object, the lock_object() function is used; it receives three parameters: the OID of the object to lock, the OID of the class the object belongs to, and the desired lock mode. In the source code of the function you can see that it behaves differently depending on the target of the lock, i.e. whether the lock is for an instance object or for a class object. Note that in CUBRID a class object is also an object: a class object has an OID, and since all class objects are instances of the root class, ROOTOID, the OID of the root object, is used as its class OID. From the code you can also see that the required intent lock is set on the class object whenever a lock is requested for an instance object.

There is also a concept of lock waiting time in a lock request. To retrieve the lock timeout value set for the current transaction, the logtb_find_wait_secs() function is called. CUBRID supports the SET TRANSACTION LOCK TIMEOUT SQL command and the setLockTimeout() method in JDBC; the command specifies the lock timeout of the current transaction.
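Going back to Table 1 for a moment, the compatibility check itself boils down to a table lookup. The following cut-down sketch covers only the NULL/IS/S/IX/SIX/X subset of Table 1 (the full matrix, including NS, NX, and U, is the lock_Comp[][] array in lock_table.c); the enum and function here are illustrative, not the CUBRID definitions.

#include <stdbool.h>

/* Row = newly requested mode, column = currently held mode (subset of Table 1). */
typedef enum { LK_NULL, LK_IS, LK_S, LK_IX, LK_SIX, LK_X, LK_MODES } lock_mode;

static const bool lock_compatible[LK_MODES][LK_MODES] = {
    /*            NULL   IS     S      IX     SIX    X     */
    /* NULL */  { true,  true,  true,  true,  true,  true  },
    /* IS   */  { true,  true,  true,  true,  true,  false },
    /* S    */  { true,  true,  true,  false, false, false },
    /* IX   */  { true,  true,  false, true,  false, false },
    /* SIX  */  { true,  true,  false, false, false, false },
    /* X    */  { true,  false, false, false, false, false },
};

/* A request is granted only if it is compatible with every lock currently
 * held on the resource; otherwise it goes to the waiter list. */
bool can_grant(lock_mode requested, const lock_mode *held, int n_held)
{
    for (int i = 0; i < n_held; i++) {
        if (!lock_compatible[requested][held[i]]) {
            return false;
        }
    }
    return true;
}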
Lock waiting time is the time a transaction that has requested a lock must wait when the object is already locked by another transaction and the requested lock is not compatible with the lock already set. As you have seen, the 2PL concurrency control method does not grant locks to other transactions until the existing lock is released. A transaction should set a lock timeout for two reasons: the user may not want to wait too long for a lock, and it lowers the impact of deadlocks.

Deadlocks

A deadlock occurs when two or more transactions each request resources locked by the others, so that none of them can make progress. Figure 2 below shows an example of a deadlock.

Figure 2: Transaction Deadlock.

First, Transaction 1 executes the statement UPDATE participant SET gold=10 WHERE host_year=2004 AND nation_code='KOR' and sets X_LOCK on the 'KOR' record. Transaction 2 sets X_LOCK on the 'JPN' record, and Transaction 3 sets X_LOCK on the 'CHN' record. After that, Transaction 1 requests X_LOCK on the 'JPN' record in order to UPDATE it. However, the 'JPN' record is already locked with X_LOCK by Transaction 2, so Transaction 1 must wait until Transaction 2 ends; under the 2PL protocol the X_LOCK is released only when the transaction ends. Transaction 2 in turn requests X_LOCK on the 'CHN' record and waits for Transaction 3. Finally, Transaction 3, which holds X_LOCK on the 'CHN' record, requests X_LOCK on the 'KOR' record and waits for Transaction 1. As a result, Transaction 1 waits for Transaction 2 to end, Transaction 2 waits for Transaction 3 to end, and Transaction 3 waits for Transaction 1 to end, so no transaction can make progress. This is called a deadlock.

Most DBMSs that use the 2PL method, including CUBRID, use deadlock detection to solve the deadlock problem. The system periodically checks whether a cycle like the one illustrated above has formed by building a lock wait graph for the transactions being executed. In CUBRID, the deadlock detection thread checks the lock wait graph every second. When a deadlock is detected, one of the transactions involved is selected and aborted by force; this is called a unilateral abort. When a transaction is selected as the victim to be sacrificed to resolve the deadlock and is unilaterally aborted, the corresponding SQL statement returns an error code. The error message reads: "The transaction has timed out due to deadlock while waiting for X_LOCK for an object. It waited until User 2 ended." When the error is returned and the application aborts the transaction, the locks of that transaction are released and the other transactions can proceed.

To see how deadlocks are detected, see the lock_detect_local_deadlock() function in the source code. This function is called, at the interval in seconds specified by the PRM_LK_RUN_DEADLOCK_INTERVAL variable (the deadlock_detection_interval_in_secs parameter in the cubrid.conf file), on the background thread that executes thread_deadlock_detect_thread().

Even when no deadlock occurs, if the execution time of one transaction is too long, other transactions may have to wait too long as well. For some applications it is wiser to give up than to wait. In particular, when a web server issues DB requests and the wait time is too long, all of the web server's threads end up tied to DB processing and can no longer serve external HTTP requests, causing service failures.
Therefore, for a web application, threads should be returned without waiting an unlimited amount of time for DB processing, even if that means an error. Two methods are available for this: one is the lock timeout supported by CUBRID, and the other is query cancellation; JDBC defines an API that can cancel an SQL statement while it is executing.

The key data structures of the lock manager are defined in the lock_manager.c file.

typedef struct lk_entry LK_ENTRY;
struct lk_entry
{
#if defined(SERVER_MODE)
  struct lk_res *res_head;      /* back to resource entry           */
  THREAD_ENTRY *thrd_entry;     /* thread entry pointer             */
  int tran_index;               /* transaction table index          */
  LOCK granted_mode;            /* granted lock mode                */
  LOCK blocked_mode;            /* blocked lock mode                */
  int count;                    /* number of lock requests          */
  struct lk_entry *next;        /* next entry                       */
  struct lk_entry *tran_next;   /* list of locks that trans. holds  */
  struct lk_entry *class_entry; /* ptr. to class lk_entry           */
  LK_ACQUISITION_HISTORY *history;      /* lock acquisition history         */
  LK_ACQUISITION_HISTORY *recent;       /* last node of history list        */
  int ngranules;                /* number of finer granules         */
  int mlk_count;                /* number of instant lock requests  */
  unsigned char scanid_bitset[1];       /* PRM_LK_MAX_SCANID_BIT/8];       */
#else                           /* not SERVER_MODE */
  int dummy;
#endif                          /* not SERVER_MODE */
};

typedef struct lk_res LK_RES;
struct lk_res
{
  MUTEX_T res_mutex;            /* resource mutex */
  LOCK_RESOURCE_TYPE type;      /* type of resource: class,instance */
  OID oid;
  OID class_oid;
  LOCK total_holders_mode;      /* total mode of the holders */
  LOCK total_waiters_mode;      /* total mode of the waiters */
  LK_ENTRY *holder;             /* lock holder list */
  LK_ENTRY *waiter;             /* lock waiter list */
  LK_ENTRY *non2pl;             /* non2pl list */
  LK_RES *hash_next;            /* for hash chain */
};

In this file, the lk_Gl global variable of type LK_GLOBAL_DATA is the core. The LK_ENTRY structure represents a lock itself: for example, when transaction T1 requests a lock, one LK_ENTRY is created. LK_RES is a structure that identifies the resource a lock belongs to. In CUBRID all resources are objects (instance objects and class objects), so they are identified by OIDs. In the LK_RES structure you can see the list of holders, of type LK_ENTRY, and the list of waiters. The holder list contains the transactions that currently hold a lock on the resource. For example, when transactions T1 and T2 have acquired S_LOCK on the data record with OID1, the LK_ENTRY entries corresponding to the S_LOCKs of T1 and T2 are registered in the holder list. When transaction T3 then requests X_LOCK on the OID1 record, T3 must wait because of the existing S_LOCKs, so the LK_ENTRY corresponding to T3's X_LOCK is registered in the waiter list. Which lock is held by which transaction is maintained in the tran_lock_table variable, a table of LK_TRAN_LOCK structures. The wait-for graph used for detecting deadlocks is expressed through TWFG_node and TWFG_edge of the LK_WFG_NODE and LK_WFG_EDGE structures. The lock_detect_local_deadlock() function builds the wait-for graph and detects whether there is a cycle in it.
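Conceptually, detecting a deadlock on the wait-for graph is a cycle search. The sketch below uses a plain adjacency matrix and a depth-first search; it is only meant to illustrate the idea behind lock_detect_local_deadlock(), whose real implementation works on the LK_WFG_NODE and LK_WFG_EDGE structures and handles far more details.

#include <string.h>

/* Node i = transaction i; edge i -> j means "transaction i waits for
 * transaction j". A deadlock exists when the graph contains a cycle. */
#define MAX_TRAN 64

static int waits_for[MAX_TRAN][MAX_TRAN];   /* adjacency matrix */
static int n_tran;

/* 0 = unvisited, 1 = on the current DFS path, 2 = fully explored */
static int state[MAX_TRAN];

static int dfs_has_cycle(int t)
{
    state[t] = 1;
    for (int u = 0; u < n_tran; u++) {
        if (!waits_for[t][u]) continue;
        if (state[u] == 1) return 1;                   /* back edge: cycle */
        if (state[u] == 0 && dfs_has_cycle(u)) return 1;
    }
    state[t] = 2;
    return 0;
}

int deadlock_exists(void)
{
    memset(state, 0, sizeof(state));
    for (int t = 0; t < n_tran; t++) {
        if (state[t] == 0 && dfs_has_cycle(t)) return 1;
    }
    return 0;
}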
When a cycle is detected, the lock_select_deadlock_victim() function selects a victim transaction to be sacrificed to resolve the deadlock. For reference, transactions keep executing while the wait-for graph is built and checked, so the information of any transaction that has ended is removed from the graph. The victim transaction is selected based on the following criteria:

If a transaction is not a lock holder, it cannot be a victim.
A transaction that is in its commit phase or rollback phase cannot be selected as a victim.
Prefer a transaction whose lock timeout is not set to -1 (unlimited waiting).
Prefer the most recent transaction over older ones. (Transaction IDs are assigned incrementally; a transaction with a smaller number is the older one.)

Conclusion

This concludes the talk about two-phase locking in CUBRID. I briefly covered the types of concurrency control, the difference between 2PL and MVCC, which locking technique is used in CUBRID RDBMS, the locking modes and their compatibility, and finally deadlocks and their resolution. In this article I have mentioned OIDs (Object Identifiers), which are used to identify instance objects as well as class objects. In the next article I will continue this topic and explain what objects, classes, and OIDs are.
Posted over 11 years ago by Esen Sagynov
Recently Node.js has become one of the favorite tools developers choose to create new web services or network applications. Among the reasons are its event-driven, non-blocking I/O architecture, which lets developers create very lightweight, efficient, and highly scalable real-time applications that run across distributed servers. Node.js has been widely adopted by individual developers as well as large corporations such as LinkedIn, Yahoo!, Microsoft, and others. It has become so popular that developers have started writing and publishing so-called Node Packaged Modules, which further extend the functionality of the Node.js platform. In fact, there are over 17,000 registered modules at https://npmjs.org/ which have been downloaded over 12,000,000 times during the last month alone. That is how popular the Node.js platform has become.

node-cubrid

To allow Node.js developers to connect to and work with CUBRID Database Server, we have developed the node-cubrid module and published it at NPM. node-cubrid provides a set of APIs to connect to and query CUBRID databases. Besides the database-specific APIs, the module also supplies several helper APIs which are useful to sanitize and validate user input values and to format and parameterize SQL statements.

Compatibility

node-cubrid has been developed in pure JavaScript, so it has no dependency on any external library. This allows users to develop CUBRID-based Node.js applications on any Node.js compatible platform such as Linux, Mac OS X, and Windows. For the same reason node-cubrid is designed to work with any version of CUBRID RDBMS, although for the time being it has been tested only with CUBRID 8.4.1. This is different from other CUBRID drivers such as PHP/PDO, Python, Perl, Ruby, OLEDB, and ODBC, which have a dynamic dependency on the CUBRID C Interface (CCI). Since CUBRID is available only on Linux and Windows, those drivers are limited to these platforms as well as to specific CUBRID versions. CUBRID's Node.js and ADO.NET drivers, however, have no such dependency and can therefore be used on any platform where the respective run-time environment can run.

Installation

Installing and using node-cubrid is easy. To install, run the npm install command with the node-cubrid module name as an argument in the directory where your Node.js application is located.

npm install node-cubrid

This will install the latest version available at https://npmjs.org/. Once installed, the module can be accessed by requiring the node-cubrid module:

var CUBRID = require('node-cubrid');

The node-cubrid module exports the following properties and functions:

Helpers: an object which provides a set of helper functions.
Result2Array: an object which provides functions to convert DB result sets into JS arrays.
createDefaultCUBRIDDemodbConnection(): a function which returns a connection object to work with a local demodb database.
createCUBRIDConnection(): a function which returns a connection object to work with a user-defined CUBRID host and database.

Request flow in node-cubrid

The request flow in the node-cubrid module is illustrated below. Because node-cubrid is developed to take full advantage of JavaScript and Node.js programming, when executing a SQL statement in node-cubrid, developers need to listen for the EVENT_QUERY_DATA_AVAILABLE and EVENT_ERROR events, or provide a callback function which will be called once there is a response from the server.
When the request is sent to the server, CUBRID executes it, and returns the response, which can be either a query result set, or the error code. It is by design that CUBRID does not return any identification about the request sender. In other words, in order to associate the response with a request, the driver has to have only one active request which can be the only owner of this response. For this reason, if a developer wants to execute several queries, they must execute them one after another, i.e. sequentially, NOT in parallel. This is how the communication between the driver and the server is implemented in CUBRID and many other database systems. If there is a vital need to run queries in parallel, developers can use connection pooling modules. We will explain this technique in the examples below. Using node-cubrid Establishing a connection First, user establishes a connection with a CUBRID server by providing a host name (default: ‘localhost’), the broker port (default: 33000), database username (default: ‘public’), password (default: empty string), and finally the database name (default: ‘demodb’). conn.connect(function (err) {    if (err) {        throw err.message;    }    else{        console.log('connection is established');                  conn.close(function () {            console.log('connection is closed');        });    }}); The above code illustrates a callback style when a function is passed as an argument to a connect() API which is called if the connection has been successfully established. Alternatively, developers can write applications based on an event-based coding style. For example, the above code can be rewritten as: conn.connect(); conn.on(conn.EVENT_ERROR, function (err) {     throw err.message; }); conn.on(conn.EVENT_CONNECTED, function () {     // connection is established     conn.close(); }); conn.on(conn.EVENT_CONNECTION_CLOSED, function () {     // connection is closed }); If you prefer the event-based coding style, refer to the Driver Event model wiki page to learn more about other events node-cubrid emits for certain API calls. Executing queries Once connected, users can start executing SQL queries. There are several APIs you can use to execute queries in node-cubrid: query(sql, callback); queryWithParams(sql, arrParamsValues, arrDelimiters, callback); execute(sql, callback); executeWithParams(sql, arrParamsValues, arrDelimiters, callback); batchExecuteNoQuery(sqls, callback); Eventually all of the above APIs execute given SQL queries. The difference is that query* APIs return data records while *execute* APIs do not return any record. So basically, you would use query* with SELECT queries while *execute* with INSERT/UPDATE/DELETE queries. Executing queries with parameters queryWithParams() and executeWithParams() APIs allow developers to bind values to parameterized SQL queries. Though “binding” in node-cubrid does not infer a communication with the server, the module merely replaces all ? placeholders with the given arrParamsValues values which are wrapped with arrDelimiters delimeters. 
Thus, you can bind values as follows: var code = 15214,     sql = 'SELECT * FROM athlete WHERE code = ?'; conn.queryWithParams(sql, [code], [], function (err, result, queryHandle) {      // check the error first then use the result }); The same can be done with non-result SQL statements like: var host_year = 2008,     host_nation = 'China',     host_city = 'Beijing',     opening_date = '08-08-2008',     closing_date = '08-24-2008',     sql = 'INSERT INTO olympic (host_year, host_nation, host_city, opening_date, closing_date) VALUES (?, ?, ?, ?, ?)'; conn.executeWithParams(sql, [host_year, host_nation, host_city, opening_date, closing_date], ["", "'", "'", "'", "'"], function (err) {      // check the error first }); If you need to insert multiple records at once in the form of VALUES (...), (...), ..., you can use helper functions to manually populate ? placeholders with values as shown below. var sql = 'INSERT INTO olympic (host_year, host_nation, host_city, opening_date, closing_date) VALUES ',   partialSQL = '(?, ?, ?, ?, ?)',   data = [{...}, {...}, {...}],   values = []; data.forEach(function (r) {   var valuesSQL = CUBRID.Helpers._sqlFormat(     partialSQL,     [r.host_year, r.host_nation, r.host_city, r.opening_date, r.closing_date],     ["", "'", "'", "'", "'"]   );      values.push(valuesSQL); }); sql += values.join(','); conn.execute(sql, function (err) {      // check the error first }); Fetching more data Sometimes, when quering a database, it happens that the results set is quite large that it has to be retrieve in multiple steps. Below you can see how to keep fetching more data until all data is retrieved. var sql = 'SELECT * FROM participant'; conn.query(sql, function (err, result, queryHandle) {   // assuming no error is returned // the following outputs 916     console.log(CUBRID.Result2Array.TotalRowsCount(result));          function outputResults (err, result, queryHandle) {       if (result) {         // 309 records are in the first results set           // 315 records are in the second results set           // 292 records are in the third results set             console.log(CUBRID.Result2Array.RowsArray(result).length);             // try to fetch more data             conn.fetch(queryHandle, outputResults);         }         else{           // no more result, close this query handle             conn.closeQuery(queryHandle, function (err) {                 conn.close(function () {                     console.log('connection closed');                 });             });         }     }          outputResults(err, result, queryHandle); }); The above are the APIs developers will use most of the time. Using a connection pool manager node-cubrid does not provide connection pool manager. However, at some point developers may want to execute multiple queries at the same time. In such cases, users can use generic-pool, also known as node-pool, as a pool manager for CUBRID connections. To install generic-pool type the following in the terminal. npm install generic-pool The following example shows how to configure generic-pool to create and destroy CUBRID connections. 
var poolModule = require('generic-pool'); var pool = poolModule.Pool({     name     : 'CUBRID',     // you can limit this pool to create maximum 10 connections     max      : 10,     // destroy the connection if it's idle for 30 seconds     idleTimeoutMillis : 30000,     log : true ,     create   : function(callback) {         var conn = CUBRID.createCUBRIDConnection('localhost', 33000, 'dba', 'password', 'demodb');         conn.connect(function (err) {           callback(err, conn);         });     },     destroy  : function(con) {       conn.close();     } }); Then, the connection pool manager can be used in your application as follows. pool.acquire(function(err, conn) {     if (err) {         // handle error - this is generally the err from your         // factory.create function       }     else {         conn.query("select * from foo", function() {             // once done querying, return the object back to pool             pool.release(conn);         });     } }); Using node-cubrid with async module node-cubrid module provides ActionQueue helper module which provides the waterfall functionality of async module. You can use ActionQueue as follows: CUBRID.ActionQueue.enqueue([     function (cb) {       conn.connect(cb);     },     function (cb) {       conn.getEngineVersion(cb);     },     function (engineVersion, cb) {       console.log('Engine version is: ' + engineVersion);       conn.query('select * from code', cb);     },     function (result, queryHandle, cb) {       console.log('Query result rows count: ' + Result2Array.TotalRowsCount(result));       console.log('Query results:');       var arr = Result2Array.RowsArray(result);       for (var k = 0; k < arr.length; k++) {         console.log(arr[k].toString());       }              conn.closeQuery(queryHandle, cb);       console.log('Query closed.');     },     function (cb) {       conn.close(cb);       console.log('Connection closed.');     }   ],   function (err) {     if (err == null) {       console.log('Program closed.');     } else {       throw err.message;     }   } ); The above is identical to async’s waterfall function shown below. async.waterfall([     function (cb) {       conn.connect(cb);     },     function (cb) {       conn.getEngineVersion(cb);     },     function (engineVersion, cb) {       console.log('Engine version is: ' + engineVersion);       conn.query('select * from code', cb);     },     function (result, queryHandle, cb) {       console.log('Query result rows count: ' + Result2Array.TotalRowsCount(result));       console.log('Query results:');       var arr = Result2Array.RowsArray(result);              for (var k = 0; k < arr.length; k++) {         console.log(arr[k].toString());       }              conn.closeQuery(queryHandle, cb);       console.log('Query closed.');     },     function (cb) {       conn.close(cb);       console.log('Connection closed.');     }   ],   function (err) {     if (err == null) {       console.log('Program closed.');     } else {       throw err.message;     }   } ); Roadmap At the time of writing this artile node-cubrid version 1.0.1 stable was the latest release. In the future version we plan to improve node-cubrid a lot to make it more convenient for developers to code. For example, developers will be able to bind values with a single object parameter. Its properties and their values will serve as column names and values in the SQL statement. Very convenient which also increases the code readability. 
We will also add new APIs to retrieve the values set for server configuration parameters. As of version 1.0.1, node-cubrid does not return the number of affected rows after executing write queries; this will also be implemented. In addition, there will be APIs to obtain table schema information, which will be very beneficial for ORM developers. Besides these, in an upcoming version node-cubrid will allow connecting to a CUBRID server using a connection URL, the same API we already provide in all other drivers. This will make it possible to pass a list of alternative hosts for broker-level failover, specify the query timeout duration, etc.

We plan to add a lot of new functionality to node-cubrid. If you have a specific request, please create an issue in the CUBRID JIRA issue tracker, or let us know via IRC, Twitter, or Facebook. We will be glad to review your request. If you have specific questions about CUBRID or the node-cubrid module, you can ask at our Q&A site.
Posted over 11 years ago by Se Hoon Park
In the middle of this year a news report announced that Facebook is planning to support Google's SPDY protocol at large scale, and that they are already implementing SPDY/v2. Here is the official response from Facebook that I have found on this topic. Among the various efforts devised and suggested by Google to make the Web faster, I think SPDY will be the one to become a new industry standard, and it will be included in HTTP/2.0.

Short for SPeeDy, SPDY is a new protocol Google has suggested as part of its effort to "make the Web faster." It was proposed as a protocol that uses the present and future Internet environment more efficiently by addressing the disadvantages of HTTP, which was devised in the early Internet era. In this article I will provide a brief introduction to the features and merits of SPDY, explain the current state of SPDY support, and cover what to do and what to consider when introducing it.

When was the latest version of HTTP released?

HTTP version 0.9 was first announced in 1991, and HTTP 1.0 and 1.1 were released in 1996 and 1999, respectively; since then nothing has changed in HTTP for over 10 years. These days, however, a webpage is roughly 20 times larger, with roughly 20 times more HTTP requests, than a webpage in the 1990s. Table 1 below shows data quoted from Google I/O 2012.

Table 1: Comparison of Mean Webpage Size in 2010 and 2012.

              Mean page size   Mean # of requests per page   Mean # of domains
Nov 15, 2010  702 KB           74                            10
May 5, 2012   1059 KB          84                            12

The size of the Yahoo! main page in 1996 was 34 KB, only about 1/30 of the mean webpage size in 2012. There is a significant gap even between 2010 and 2012, let alone between the 1990s and 2012. This is because the mean page size and the number of requests keep increasing as the user experience becomes more and more sophisticated along with the spread of high-speed Internet. The characteristics of today's webpages have changed from the past as follows:

They consist of many more resources.
They use multiple domains.
They operate more dynamically.
They emphasize security more.

Considering how today's web environment differs from the past, Google announced the SPDY protocol, which addresses the disadvantages of HTTP. SPDY focuses especially on resolving the problem of load latency.

Features of SPDY

Figure 1 below shows the layers of SPDY compared to the traditional TCP/IP layer model.

Figure 1: HTTP vs. SPDY.

The features of SPDY can be summarized as follows:

Always operates on Transport Layer Security (TLS). Transport Layer Security (TLS) is the successor of Secure Sockets Layer (SSL). TLS and SSL are sometimes used to refer to the same protocol because they are names for two versions of the same protocol, and that usage also applies in this article. SPDY therefore applies only to websites served over HTTPS.

HTTP header compression. HTTP headers carry a lot of redundant content with every request, so you can improve performance significantly just by compressing them. According to a report by Google, header compression reduces size by 10-35% even on the initial request, and by 80-97% when requests are made repeatedly over a long-lived connection. When the upload bandwidth is relatively small, as on mobile devices, HTTP header compression is even more useful.
These days, as the HTTP header averages around 2 KB and keeps growing, the benefit of compressing HTTP headers is expected to grow in the future.

Binary protocol. Because SPDY uses binary framing rather than text-based framing, it can be parsed faster and is less sensitive to errors.

Multiplexing. SPDY handles multiple independent streams concurrently over a single connection. Unlike HTTP, which handles one request at a time per connection, responding to requests in the order they were made, SPDY handles multiple requests and responses concurrently over a small number of connections. And unlike HTTP pipelining, which uses a FIFO queue so that one delayed response delays all the others, SPDY handles each request and response independently.

Full-duplex interleaving and stream prioritization. Since SPDY allows interleaving, in which one stream being processed can be interleaved with another, and stream prioritization, higher-priority data can cut into the transfer of lower-priority data and be delivered earlier.

Server push. Servers can push content without a client request, unlike Comet or long-polling. Unlike techniques such as inlining, SPDY server push keeps resources cacheable and uses the same or less bandwidth than inlining. To implement server push, however, you need additional web server application logic.

No need to rewrite your website. Except for features that require extra implementation, such as server push, you don't need to change the website itself to apply SPDY; only the browser and the server have to support SPDY. SPDY is applied completely transparently to browser users: there is no protocol scheme like "spdy://", and the browser displays nothing to indicate that the SPDY protocol is in use.

Based on these characteristics of SPDY, the following table shows the differences between HTTP and SPDY.

Table 2: HTTP/1.1 vs. SPDY.

                      HTTP                                          SPDY
Secure                Not default                                   Default
Header compression    No                                            Yes
Multiplexing          No                                            Yes
Full-duplex           No                                            Yes
Prioritization        No (instead, a browser employs heuristics)    Yes
Server push           No                                            Yes
DNS lookups           More                                          Fewer
Connections           More                                          Fewer

In short, SPDY is a protocol designed to use TCP connections more efficiently by improving HTTP's data transfer format and connection management. As a result of these efforts, in a test with the top 25 websites, SPDY was 39-55% faster than HTTP + SSL.

Why does SPDY need TLS?

Why does SPDY use TLS even though TLS adds latency due to encryption and decryption? Google's SPDY whitepaper gives the following answer: "In the long term, the importance of web security will be emphasized more and more, and thus we want to get better security in the future by specifying TLS as a sub-protocol of SPDY. We also need TLS for compatibility with the current network infrastructure, that is, to avoid compatibility issues with communication going through existing proxies."

Besides this reason, when you look at the actual implementation of SPDY, you can see that it depends heavily on the TLS Next Protocol Negotiation (NPN) extension mechanism. The TLS NPN extension determines whether a request arriving on port 443 is SPDY or not, and which SPDY version the request uses, in order to decide whether to use SPDY for the subsequent communication. Without TLS NPN, an additional round trip would be needed to negotiate the use of SPDY.
Efforts for Standardization  SPDY is being developed as an open networking protocol and has been suggested to the IETF as a candidate for HTTP/2.0. SPDY is a sub-project of the Google Chromium project, and thus the Chromium client implementations and server tools are all being developed as open source. Future of SPDY Most recently SPDY Draft 3 was released, and SPDY Draft 4 is under development. The features likely to be added in Draft 4 are as follows: name resolution push, certificate data push, and explicit proxy support. The final goal of SPDY is to provide a page within "a single connection setup time + bytes/bandwidth time". Browsers, Servers, Libraries and Web Services Supporting SPDY Currently a variety of browsers and servers support SPDY, and Google, who originally suggested SPDY, already provides almost all of its services with SPDY. The browsers, servers, libraries and services supporting SPDY are as follows. Browsers Supporting SPDY As of July 2012, the following is a list of browsers which support the SPDY protocol. Google Chrome/Chromium: Chrome and Chromium have supported SPDY from their initial versions. If you enter the following URI, you can inspect SPDY sessions in Chrome/Chromium: chrome://net-internals/#events&q=type:SPDY_SESSION%20is:active. For example, if you visit www.gmail.com, in the chrome:// tab you can see multiple SPDY sessions being created. Android mobile Chrome also supports SPDY. Firefox 11 and later versions: While added in version 11, SPDY was not enabled by default until Firefox version 13. If you enter about:config, the URI for Firefox settings, and look at network.http.spdy.enabled, you can check whether support for SPDY has been enabled or not. Android mobile Firefox 14 also supports SPDY. Amazon Silk: The Silk browser shipped with the Kindle Fire, an Android-based e-book reader from Amazon, also supports SPDY. It communicates with the Amazon EC2 service by using SPDY. Default browser of Android 3.0 and higher: The default browser of Android 3.0 (Honeycomb) and 4.0 (Ice Cream Sandwich) supports SPDY. For more information about which browsers support SPDY, see http://caniuse.com/spdy. Servers and Libraries Supporting SPDY Support for SPDY is being driven mainly by major web servers and application servers, and a variety of libraries that implement SPDY are also being developed. Nginx: Nginx released a beta version of its SPDY module on June 15, 2012, and has continuously provided patches. See the SPDY: 146% faster Slideshare presentation by the Nginx team to learn more about it. Jetty: Jetty also provides a SPDY module. Apache: A SPDY module for Apache 2.2 is also being developed. Libraries: In addition, SPDY implementations for Python, Ruby and node.js servers have already been developed or are currently being developed. There are several SPDY C libraries, including libspdy, spindly and spdylay. A library to use SPDY on iOS is also being developed. Netty: Netty has provided a SPDY package since version 3.3.1, released in 2012. Tomcat: SPDY support in Tomcat is currently under development and should come in Tomcat version 8. Services Using SPDY As mentioned earlier, Google has already converted almost all of its services, including search, Gmail and Google+, to HTTPS, and provides them through SPDY. In addition, when Google App Engine uses HTTPS, it also supports SPDY. Twitter also uses SPDY when providing service via HTTPS. However, among the numerous websites on the Internet, only a few use SPDY.
According to a survey by Netcraft conducted in May 2012, of a total of 660 million websites only 339 websites are currently using SPDY. In other words, except for Google and Twitter, there is hardly any major website using SPDY. When SPDY is Not Very Efficient SPDY is not always fast. Sometimes you may not get any performance improvement with SPDY. Such situations are as follows. When using only HTTP: As SPDY always requires SSL, you need additional SSL handshake time. Therefore, when you convert an HTTP site into HTTPS to support SPDY, you may not obtain a distinct performance improvement because of the SSL handshake. When there are too many domains: SPDY operates by domain. This means that it requires as many connections as there are domains, and that request multiplexing is available only within a single domain. Moreover, as it is difficult to make all domains support SPDY, you may not get the merits of SPDY when there are too many domains. Especially when a CDN does not support SPDY, you cannot expect a performance improvement from SPDY. When HTTP is not the bottleneck: For most pages, HTTP is not the bottleneck. For example, in cases when a resource can be downloaded only after another resource has been downloaded, SPDY will not be that effective. When Round-Trip Time (RTT) is low: SPDY is more efficient when RTT is high. When RTT is very low, for example, in communication between servers within an IDC, SPDY has few merits. When there are very few resources in a page: For pages with six or fewer resources, SPDY has few merits because the value of reusing a connection is not significant. Things to Do to Introduce SPDY  When you introduce SPDY, you need to carry out the following tasks to apply it most efficiently. Application Level Use only one connection: For better SPDY performance and more efficient use of Internet resources, you need to use as few connections as possible. If you use a small number of connections when using SPDY, you can see benefits such as packing data into packets in a better way, getting better header compression efficiency, reducing the frequency of checking the connection status, and reducing the frequency of handshakes. Also, in terms of Internet resources, with a small number of connections you get more efficient TCP behavior and reduce bufferbloat. Bufferbloat is a phenomenon in which excess buffering of packets at a router or a switch causes high latency and reduced throughput. With the idea of avoiding packet discards as much as possible, and with low memory prices, the buffer size of routers and switches is continuously increasing. Packets that should have been discarded earlier can survive longer, and as a result these packets hinder the TCP congestion avoidance algorithm, deteriorating overall network performance. Avoid domain sharding: Domain sharding is a kind of expedient used to avoid the restriction on the number of concurrent downloads (in general, six downloads per hostname in modern browsers) in web applications. If you use SPDY and comply with the "use a single connection" recommendation, you don't need to use domain sharding. To make things worse, domain sharding causes additional DNS queries and makes applications more complex. Use server push instead of inlining: Inlining stylesheets or scripts is often used to reduce the number of HTTP requests, and thus RTTs, in web applications. However, inlining makes web pages less cacheable and increases the size of webpages due to base64 encoding. If you use the SPDY server push feature to push content instead, you can avoid these problems.
Use request prioritization: You can enable the client to inform the server of the relative priority of resources by using the request prioritization feature of SPDY. A simple common heuristic prioritization could be html > js, css > *. Choose the proper size of a SPDY frame: Although the SPDY spec allows large frames, sometimes a small frame is more desirable. This is because a small frame allows interleaving to work better. SSL Level Use a smaller, full certificate chain: The size of a certificate chain has a huge influence on the performance of connection initialization. If there are more certificates in a certificate chain, it will take more time to verify the validity of the certificates, and more space will be occupied in initcwnd. initcwnd: the initial size of the TCP congestion window, used by the TCP congestion control algorithm. The congestion window is a sender-side window whose size is controlled by the TCP congestion control algorithm; it is different from the TCP window, which is a receiver-side limit on the size.  In addition, if a server does not provide a full certificate chain, the client will spend additional RTTs to fetch the intermediate certificates. Therefore, if a large, incomplete certificate chain is used, it will take longer for an application to be able to use the connection. Use a wildcard certificate (e.g., *.naver.com) if possible: If you use wildcard certificates, you can reduce the number of connections and use the connection sharing of SPDY. As a wildcard certificate is issued by a certificate authority, however, you may need to discuss it with the authority and pay extra costs. Do not set the size of the SSL write buffer too large: If the SSL write buffer is too large, a TLS application record will be spread over multiple packets. As an application can process the record only after the entire TLS application record has been received, a record spread over multiple packets will cause additional latency. Google servers use 2 KB buffers. TCP Level Set the initcwnd of the server to at least 10: Initcwnd is the main bottleneck that affects the initial loading time of a page. If you use only HTTP, you can avoid this problem by attaining an initial congestion window size of n × initcwnd by opening multiple connections concurrently. As a single connection is advantageous in SPDY, however, it is better to set initcwnd to a large value from the beginning. In old Linux kernels this value is fixed at 2-3, and no method to adjust it is provided. As this value was determined according to the reliability and bandwidth of TCP networks when it was first considered, it is not suitable for today's TCP networks, which have higher stability and bandwidth. The method to adjust this value was added in Linux kernel 3.0, and the latest Linux kernels already use a default value of 10 or higher. Disable tcp_slow_start_after_idle: The tcp_slow_start_after_idle parameter on Linux is set to 1 by default. This causes the congestion window to return to the initcwnd size when the connection goes idle, and makes TCP Slow Start restart. TCP Slow Start is an algorithm that starts by sending packets within a congestion window of the initcwnd size and then increases the TCP congestion window up to the maximum value allowed by the network or up to the TCP window of the receiver side. If the initcwnd value is small, it takes more round-trips until the window reaches the maximum size allowed by the network, and as a result the initial page loading time will increase. As this eliminates the advantage of SPDY's single connection, this setting should be disabled.
You can change the setting by using the sysctl command. It is also advantageous to disable this setting when using HTTP keepalive. If SPDY is Really Introduced … In conclusion, to actually introduce SPDY, you need to consider a variety of matters and modify applications and servers. You cannot ignore the costs of introducing the protocol, either. For a Web application written for Tomcat, which does not yet support SPDY, you should consider the cost required to change the Web application server, as well as the cost required to implement the server push functionality. Aside from the costs required to introduce SPDY, what should we take into account first when we introduce SPDY in a real service? I chose three matters to consider. The service should already use HTTPS: You cannot get many advantages when introducing SPDY for services using only HTTP. You would have to pay the cost of introducing SSL as well. You should be able to change the Linux kernel: Even CentOS 6.3, the latest version released on July 9, 2012, still uses kernel 2.6.32. Adjustment of initcwnd is supported only from kernel 3.0, and you should change the kernel if possible because the performance improvement you can get by adjusting initcwnd is very significant. Consider the ratio of users with SPDY-supporting browsers: In Korea, many users still use IE, which does not support SPDY. On mobile devices, iOS does not yet support SPDY, while Android 3.0 or higher supports SPDY. Therefore, until there are sufficient users of SPDY-supporting browsers, you should carefully compare the advantages you can get from the performance improvement derived from SPDY with the costs required to introduce the protocol. By Sehoon Park, Senior Engineer at Web Platform Development Lab, NHN Corporation. [Less]
Posted over 11 years ago by Dongsoon Choi
This is the fourth article in the series of "Become a Java GC Expert". In the first issue Understanding Java Garbage Collection we have learned about the processes for different GC algorithms, about how GC works, what Young and Old Generation is ... [More] , what you should know about the 5 types of GC in the new JDK 7, and what the performance implications are for each of these GC types. In the second article How to Monitor Java Garbage Collection we have explained how JVM actually runs the Garbage Collection in the real time, how we can monitor GC, and which tools we can use to make this process faster and more effective. In the third article How to Tune Java Garbage Collection we have shown some of the best options based on real cases as our examples that you can use for GC tuning. Also we have explained how to minimize the number of objects passed to Old Area, decreasing Full GC time, as well as how to set GC type and the memory size. In this fourth article I will explain the importance of MaxClients parameter in Apache that significantly affects the overall system performance when GC occurs. I will provide several examples through which you will understand the problem MaxClients value causes. I will also explain how to reliably set the proper value for MaxClients depending on the available system memory. The effect of MaxClients on the system The operation environment of NHN services has a variety of Throttle valve-type options. These options are important for reliable service operation. Let's see how the MaxClients option in Apache affects the system when Full GC has occurred in Tomcat. Most developers know that "stop the world (STW) phenomenon" occurs when GC has occurred in Java (for more refer to Understanding Java Garbage Collection). In particular, Java developers at NHN may have experienced faults caused by GC-related issues in Tomcat. Because Java Virtual Machine (JVM) manages the memory, Java-based systems cannot be free of the STW phenomenon caused by GC. Several times a day, GC occurs in services you have developed and currently operate. In this situation, even if TTS caused by faults does not occur, services may return unexpected 503 errors to users. Service Operation Environment For their structural characteristics, Web services are suitable for scale-out rather than scale-up. So, generally, physical equipment is configured with Apache  * 1 + Tomcat * n according to equipment performance. However, this article assumes an environment where Apache * 1 + Tomcat * 1 are installed on one host as shown in Figure 1 below for a convenient description.   Figure 1: Service Operation Environment Assumed for the Article. For reference, this article describes options in Apache 2.2.21 (prefork MPM), Tomcat 6.0.35, jdk 1.6.0_24 on CentOS 4.72 (32-bit) environment. The total system memory is 2 GB and the Garbage Collector uses ParallelOldGC. The AdaptiveSizePolicy option is set to true by default and the heap size is set to 600m. STW and HTTP 503 Let's assume that requests are flowing into Apache at 200 req/s and more than 10 httpd processes are running for service, even though this situation may depend on response time for requests. In this situation, assuming that the pause time at full GC is 1 second, what will happen if full GC occurs in Tomcat? The first thing that hits your mind is that Tomcat will be paused by full GC without responding to all requests being processed. In this case, what will happen to Apache while Tomcat is paused and requests are not processed? 
While Tomcat is paused, requests will continuously flow into Apache at 200 req/s. In general, before full GC occurs, the service can respond to requests quickly with only about 10 httpd processes. However, because Tomcat is now paused, new httpd processes will continuously be created for new incoming requests, within the range allowed by the MaxClients parameter value of the httpd.conf file. As the default value is 256, Apache will keep accepting the requests flowing in at 200 req/s without any restriction. At this point, what happens to the newly created httpd processes? Httpd processes forward requests to Tomcat by using the idle connections in the AJP connection pool managed by the mod_jk module. If there is no idle connection, they will try to create new connections. However, because Tomcat is paused, these new connection requests cannot be accepted. They will therefore be queued in the backlog queue, up to the backlog size set in the AJP Connector of the server.xml file. If the number of queued requests exceeds the size of the backlog queue, a connection refused error will be returned to Apache and Apache will return the HTTP 503 error to users. In the assumed situation, the default size of the backlog is 100 and the requests are flowing in at 200 req/s. Therefore, more than 100 requests will receive the 503 error during the 1 second of pause time caused by full GC. In this situation, when full GC is over, the sockets in the backlog queue are accepted by Tomcat and assigned to worker threads, within the range allowed by MaxThreads (defaults to 200), in order to process the requests. MaxClients and backlog In this situation, which option should be set in order to prevent users from receiving the 503 error?  First, we need to understand that the backlog value should be large enough to accept the requests flowing to Tomcat during the pause time of full GC. In other words, it should be set to at least 200.  Now, is there a problem with such a configuration? Let's repeat the above situation under the assumption that the backlog setting has been increased to 200. The result is even more serious, as shown below. The system memory usage is typically 50%. However, it rapidly increases to almost 100% when full GC occurs, causing a rapid increase in swap memory usage. Moreover, because the pause time of full GC increases from 1 second to 4 or more seconds, the system is down for that time and cannot respond to any requests. In the first situation, only slightly more than 100 requests received the 503 error. However, after increasing the backlog size to 200, more than 500 requests will hang for 3 or more seconds and cannot receive responses. This is a good example of the more serious situations that can occur when you do not precisely understand how settings relate to each other and how they affect the system. Then, why does this phenomenon occur? The reason lies in the characteristics of the MaxClients option. Setting MaxClients to a generous value is not a problem in itself. The most important thing in setting the MaxClients option is that the total memory usage should be calculated so that it does not exceed 80% even when as many httpd processes as the MaxClients value have been created. The swappiness value of the system is set to 60 (the default). As such, when memory usage exceeds 80%, swapping will actively occur. Let's see why this characteristic causes the more serious situation described above.
When requests are flowing in at 200 req/s and Tomcat is paused by full GC, with the backlog setting at 200, approximately 100 more httpd processes can be created in Apache than in the first case. In this situation, when the total memory usage exceeds 80%, the OS will actively use the swap memory area, and memory pages holding the old area of the JVM will be moved to the swap area, since the OS considers them unused for a long period. Finally, when the swap area has to be used during GC, the pause time will increase rapidly. So the number of httpd processes increases, causing 100% memory usage, and the situation previously described will occur. The difference between the two cases is only the backlog setting value: 100 vs. 200. Why did this situation occur only for 200? The reason for the difference is the number of httpd processes created in these configurations. When the setting value is 100 and full GC occurs, 100 requests for creating new connections are queued in the backlog queue. The other requests receive the connection refused error message and return the 503 error. Therefore, the total number of httpd processes will be slightly more than 100. When the value is set to 200, then 200 requests for creating new connections can be accepted. Therefore, the total number of httpd processes will be more than 200, and that value exceeds the threshold at which memory swapping begins. Then, by setting the MaxClients option without considering the memory usage, the number of httpd processes rapidly increases with full GC, causing swapping and degradation of the system performance. If so, how can we determine the MaxClients value, that is, the threshold value for the current system situation? Calculation Method of MaxClients Setting As the total memory of the system is 2 GB, the MaxClients value should be set to use no more than 80% of the memory (1.6 GB) in any situation in order to prevent performance degradation caused by memory swapping. In other words, the 1.6 GB of memory should be shared and allocated among Apache, Tomcat, and the agent-type programs which are installed by default. Let's assume that the agent-type programs, which are installed in the system by default, occupy about 200m of memory. For Tomcat, the heap size set with -Xmx is 600m. Therefore, Tomcat will always occupy 725m (Perm Gen + Native Heap Area) based on the top RES value (see the figure below). Finally, Apache can use 700m of the memory. Figure 2: Top Screen of Test System. If so, what should the value of MaxClients be with a memory budget of 700m? It will differ according to the type and the number of loaded modules. However, for NHN Web services, which use Apache as a simple proxy, 4m (based on top RES) will be enough for one httpd process (see Figure 2). Therefore, the maximum MaxClients value for 700m should be 175, as recapped in the sketch below.  Conclusion Reliable service configuration should decrease the system downtime under overload and send successful responses to requests within the allowable range. For Java-based Web services, you must check whether the service has been configured to respond reliably to the STW under full GC. If the MaxClients option is set to a large value without considering the system memory usage, in order to cope with simple increases in user requests or to withstand DDoS attacks, it loses its functionality as a throttle valve, causing bigger faults.
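To make the calculation above concrete, here is a minimal sketch of the arithmetic in Java. The figures are the ones used in this article (measured via top RES), so treat them as placeholders for the values you would measure on your own system:

```java
// A minimal sketch of the MaxClients calculation walked through above.
// All figures come from the example in this article; measure your own
// numbers (top RES) before applying this to a real system.
public class MaxClientsEstimate {
    public static void main(String[] args) {
        int totalMb    = 2048;                 // total system memory (2 GB)
        int usableMb   = totalMb * 80 / 100;   // stay below the ~80% swap threshold
        int agentsMb   = 200;                  // agent-type programs installed by default
        int tomcatMb   = 725;                  // -Xmx600m heap + Perm Gen + native heap
        int perHttpdMb = 4;                    // one httpd process acting as a simple proxy

        // Memory left for Apache: about 700 MB in the article (713 MB here).
        int apacheMb = usableMb - agentsMb - tomcatMb;

        // Dividing by 4 MB per httpd process gives roughly the article's value of 175.
        int maxClients = apacheMb / perHttpdMb;
        System.out.println("Set MaxClients to no more than about " + maxClients);
    }
}
```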
In this example, the best way to solve the problem is to expand the memory or add servers, or to set the MaxClients option to 175 (in the above case) so that Apache returns the 503 error only to the requests that exceed 175. The situation described in this article occurs within 3 to 5 seconds, so it cannot be caught by most monitoring tools, which run at regular sampling intervals. By Dongsoon Choi, Senior Engineer at Game Service Technical Support Team, NHN Corporation. [Less]
Posted over 11 years ago by Esen Sagynov
For those who have been using CUBRID 8.4.1 in production but wanted to get their hands on the latest features available in CUBRID 9.0, such as native support for Database Sharding, there is great news! We are announcing the immediate availability of ... [More] CUBRID 8.4.3, a release fully compatible with 8.4.1 which introduces Database Sharding and API-level Load Balancing, among other features, to the 8.4.x family! You can download CUBRID 8.4.3 from the Downloads page. Below I will explain in detail the new features introduced in CUBRID 8.4.3. Database Sharding: CUBRID SHARD, which was first announced in CUBRID 9.0, is now available in CUBRID 8.4.3. CUBRID SHARD makes it convenient to process a large volume of data by facilitating access to databases horizontally partitioned across multiple servers. The CUBRID SHARD feature provides a single view that shows databases which are spread across multiple nodes as a single database, and transparency that allows users to work with them without accessing individual databases. CUBRID SHARD is a great feature, and we are happy to include it in CUBRID 8.4.3. Another great feature of CUBRID SHARD is that it supports MySQL as a backend database alongside CUBRID. For more information about this, read our previous blog where you can find a Slideshare presentation with more details. Driver Features. API-level Load Balancing: In CUBRID 8.4.3 we have added a new feature to the CCI and JDBC drivers which allows applications to connect, in a random order, to the main host or to the hosts specified by the althosts parameter in the connection URL. In the following example of a connection URL, this functionality is activated by setting the value of the loadBalance parameter to true. jdbc:cubrid:host1:port1:demodb:::?althosts=host2:port2,host3:port3&loadBalance=true When loadBalance is true the driver will randomly choose a host among those specified in the connection URL, excluding the one used for the previous connection, and will then try to connect to this randomly chosen host. If the chosen host is not available, the selection will continue until all the hosts are determined to be unavailable. In that case, the driver will report an error. Thus, CUBRID now provides two levels of Load Balancing: server level, where the CUBRID Broker balances the requests among multiple CUBRID Application Servers (CAS), and API level, where the driver randomly chooses a host to connect to, thus significantly offloading the main host server. For more information refer to the cci_connect_with_url() API. New API functions: In addition to the Load Balancing we have also added two more APIs to our CCI driver. cci_close_query_result() closes the result set returned by the cci_execute(), cci_execute_array(), or cci_execute_batch() functions, thus allowing you to decrease memory consumption at any given time. cci_escape_string() escapes all unsafe characters so that the string can be safely used within an SQL statement. Extended SQL In CUBRID 8.4.3 we have also added the INET_ATON and INET_NTOA SQL functions. These functions are available in CUBRID 9.0 as well. The INET_ATON function returns a numeric value when an IPv4 address is entered, while the INET_NTOA function returns an IPv4 address value when a number is entered.
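Before the INET function examples below, here is a minimal JDBC sketch of the API-level load balancing described above. The connection URL format is the one shown earlier; the driver class name, host names, ports, and credentials are placeholders assumed for illustration:

```java
import java.sql.Connection;
import java.sql.DriverManager;

public class LoadBalancedConnectionExample {
    public static void main(String[] args) throws Exception {
        // Commonly used CUBRID JDBC driver class; adjust if your driver differs.
        Class.forName("cubrid.jdbc.driver.CUBRIDDriver");

        // With loadBalance=true the driver picks one of host1/host2/host3 at
        // random (avoiding the host used for the previous connection), as
        // described above. althosts lists the alternative broker hosts.
        String url = "jdbc:cubrid:host1:33000:demodb:::"
                   + "?althosts=host2:33000,host3:33000&loadBalance=true";

        try (Connection conn = DriverManager.getConnection(url, "dba", "")) {
            System.out.println("Connected via " + conn.getMetaData().getURL());
        }
    }
}
```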
SELECT INET_ATON('192.168.0.10');

  inet_aton('192.168.0.10')
============================
                  3232235530

SELECT INET_NTOA(3232235530);

  inet_ntoa(3232235530)
======================
  '192.168.0.10'

New Monitoring Functionality. Query execution status: CUBRID 8.4.3 now supports a new -q option in the cubrid killtran command line utility which displays the execution time in seconds of the queries that are running at this moment. This utility now shows information about: the transaction index, the client process ID, the client program name, the total execution time of the query being executed in seconds, the total execution time of the current transaction in seconds, the list of transactions which hold the lock when the current transaction is waiting for a lock, and the query statement being executed (up to 30 characters). Logging slow queries: In CUBRID 8.4.3 we added a new system parameter, sql_trace_slow_msec, which allows you to log query statements that have exceeded the specified execution time, together with their query execution plans. When the sql_trace_slow_msec parameter is set, the SQL statement, its query execution plan, and the cubrid statdump information are recorded in the server error log file and the broker application server (CAS) log file. When cubrid plandump is executed, the corresponding SQL statement and the query execution plan are displayed. However, the corresponding information is recorded in the server error log file only when the value of the error_log_level parameter is NOTIFICATION. Now, using this system parameter, you can find slow queries and tune them faster. API level debugging logs: Related to the slow query logging feature above, we have added several new connection URL parameters to the CUBRID CCI driver which can be used to instruct the driver whether or not slow queries must be logged. We have previously blogged about a similar feature in 8.4.1 which allows generic debugging logs to be written. Now you can log slow queries separately. We have added: the logSlowQueries and slowQueryThresholdMillis parameters, which write slow query log information; logTraceApi, which writes the beginning and the end of the called CCI functions; and logTraceNetwork, which writes the network data transmission information of a CCI function to a file. For example: url ="cci:cubrid:localhost:33000:demodb:::?logSlowQueries=true&slowQueryThresholdMillis=1000&logTraceApi=true&logTraceNetwork=true" New Configurations In CUBRID 8.4.3 we have added the check_peer_alive system parameter to set whether the database server process (cub_server) and the client processes connected to it should check whether the other side is running normally. The types of client processes include the broker application server (cub_cas) process, the replication log reflection server (copylogdb), the replication log copy process (applylogdb), and the CSQL interpreter (csql). When a server process and a client process do not receive any response for a long time (e.g., 5 seconds or longer) after they have been connected and are waiting for data via the network, they check whether the other side is operating normally, depending on this configuration parameter. If they decide that the other side is not operating normally, they forcibly close the connection. Built-in CUBRID Web Manager This is one of the most important features we have added to CUBRID 8.4.3. CUBRID Web Manager (CWM) is the next generation SQL client with monitoring features that we announced two months ago.
At the time CWM was not part of CUBRID but was distributed separately just like the other CUBRID Tools. However, today we are happy to announce that from now on CWM will be distributed together with the CUBRID Engine. This means that once you start the CUBRID Service with the cubrid service start command line utility, CWM will be started as well and will listen on the secure (HTTPS) port 8282 (the default, but configurable). Just open your browser, navigate to https://localhost:8282 (or the remote host) and you can start managing and monitoring your CUBRID Server in real time with CUBRID Web Manager. This is awesome! For more information about each of these new features, refer to the CUBRID 8.4.3 Release Notes. If you have questions, feel free to ask on our dedicated Q&A site, forum, Twitter, Facebook, Google+, #cubrid IRC channel, or contact us by email. We will be glad to answer you! [Less]
Posted over 11 years ago by Esen Sagynov
Today I would like to introduce the online tool we have recently found and started to use to write and manage the Release Notes for CUBRID Database. I will explain about the tool we used to use and the difficulties we had to deal with before ... [More] adopting the new solution. Then I will explain about the features of the new tool which I think will be very beneficial to manual and documentation writers. Originally we have been writing and distributing Release Notes in MS Word and have it edited and reviewed by multiple users. It provides nice changes tracker with features to accept or reject changes. Formatting in MS Word is very advanced, which is probably the biggest reason why we prefer it over other editing tools. References management in MS Word is great, too! You can also export documents to a PDF format very easily. This is what we used to do to distribute CUBRID Release Notes. MS Word is a really great tool if you need to manage your document in this way. But what was continuously tedious for us with this solution is the transformation to the Web format. As most of you may know, MS Word adds tons of garbage tags when exporting to the HTML format. It's very difficult to clean them up. Moreover, the exported format does not seamlessly integrate with your existing site style, provides no mobile optimized view, so you end up either spending much time cleaning MS Word output to fit your existing site design, or simply adding a link to a PDF file. Alternatively you would look for a solution which would export MS Word document to HTML with consistent styles and provide you with similar editing experience. We did much research on this and found 3Rabbitz Book, a 100% Web-based Enterprise Authoring Tool. About 3Rabbitz Book 3Rabbitz Book is a shareware online editing tool ideally designed for multi-author environment to work with long documents and manuals. It is free for personal usage up to 3 simultaneous editors. You can review the pricing on their site. The great news is that the 3Rabbitz Book developers also provide a free license to open source projects and non-profit organizations. If you are after a commercial license, they provide a 30-day free trial as well. With 3Rabbitz Book we started to spend more time writing the actual content rather than spending time on editing an MS Word document, converting to HTML, then cleaning up the HTML output every time we need to update the document. 3Rabbitz Book provides many great features. I will try to explain some of them below. Web-based editing toolSince it work in a browser, writers are not limited to working on a particular computer or operating system where the editing tool is installed. Wherever there is an Internet connection, you can make changes and publish your document to the Web instantly with 3Rabbitz Book. Collaborative writing3Rabbitz Book can be used by multiple authors at the same time and their work can be revised and then merged into one. 3Rabbitz Book provides a convenient collaborative environment for authors. Dividing a manual into chaptersIt is very inconvenient for multiple authors to write a manual with a word processor. This is because they should repeat the cumbersome process of splitting a file into many parts among writers, then writing each part separately, and then collecting them back into one file through a file server or mail. In this process, contents are sometimes omitted, and it takes effort to standardize inconsistent forms. 
With 3Rabbitz Book authors can divide the document into multiple parts/chapters. Each author fills out the selected content. However, unlike MS Word, you don't need to collect individual parts from all authors and compile them manually into a single document. For example, you can organize a table of contents as shown below. Figure 1: Dividing the manual among authors. Paragraph level editing3Rabbitz Book provides paragraph level editing which has the following advantages: Multiple authors can edit the same chapter simultaneously. You can do structural writing. You can manage the history of changes. You can separate content and form. Also the editor minimizes the use of a mouse and provides a variety of shortcuts so authors can use only a keyboard. Figure 2: 3Rabbitz Book Editor. Single source publicationOne of the great features we love in 3Rabbitz Book is the publish once and distribute in multiple formats feature. On your site you can provide a link to your site visitors to export the manual to PDF, EPUB, or other formats suitable for Web, print and mobile devises without additional editing overhead. Awesome! Exporting to HTMLOther manual writing tools like MS Word also support "Export to HTML" feature, but like I said before, they have problems: They support only limited HTML. You cannot change forms, or have to go through a complex process to accomplish that. You may need to export XML first instead of HTML and invest additional time and effort to create proper HTML through another tool. Some display HTML within frames while the table of contents is provided in the parent document for navigation. Search engines, such as Google, may fail to search such output as HTML content is loaded in AJAX. Whenever you update HTML, you must upload HTML files on the web server via FTP. To carry this out, you may have to use addition software apart from a content writing tool. On the contrary, with 3Rabbitz Book you can write HTML and publish it on the web and make any changes to it in a real time. You can also design a form of the Web viewer (an output viewer) in a variety of ways and they are search engine friendly. The main components of the Web viewer are as follows: Displays the order of titles, figures and tables. Provides a built-in search field. Displays an index. An index will appear only when there is an index type chapter. Provides links to Download PDF and EPUB formats. Figure 3: Web Viewer. Exporting to PDFWhile many Wiki products do not support Export to PDF feature or are unable to create a single finished PDF file, 3Rabbitz Book enables you to create a high quality PDF file composed of all parts of the manual in the right order, including table of contents, figures, tables, appendix, etc. Unlike other manual writing tools, you don't need to separately upload a file to a Web server to generate a PDF file. The following example is a screenshot of a PDF file created with 3Rabbitz Book. Check out the preciseness of the PDF it produces. Figure 4: PDF output of 3Rabbitz Book. Figure 5: TOC in PDF file. Figure 6: Manual Chapters in PDF generated by 3Rabbitz Book. Exporting to EPUBExporting to EPUB is a really cool feature which is very crucial in this mobile heavy world. Unlike other manual writing tools, you don't need to upload an EPUB file to a Web server to distribute it to users. The manual author can generate an EPUB file which will be automatically saved on the server. The following examples are screenshots of an EPUB document from an iPad created with 3Rabbitz Book. 
Figure 7: EPUB file generated by 3Rabbitz Book. Figure 8: Clean EPUB pages auto-generated by 3Rabbitz Book. Separating contents and formsTo create a good manual, you must maintain a consistent look-and-feel. When you use a word processor, it takes a lot of effort to maintain a consistent form. The form varies depending on who writes a particular chapter or when it was written. Unless a single person takes the responsibility to standardize different forms on a regular basis, the gaps in these forms will keep increasing and inconsistent numbering and references to headers may occur while combining files. On the contrary, 3Rabbitz Book separates contents and forms thoroughly. In 3Rabbitz Book you cannot insert form information while writing a content. Instead of configuring a paragraph indent or a margin between paragraphs, you can select a type, give a significance and configure a form in a lump as a theme. This is called a paragraph type. Figure 9: Setting a Paragraph Type. You can also configure a form in a lump as a theme after assigning a significance, rather than configuring a font color or bold to a certain string. This is called a character type. Figure 10: Configuring a Character Type. When viewing a content on the Web viewer or creating a PDF file, if you select a theme and a layout, the final document is produced according to the configured paragraph and character types. In the form menu comprising of themes and layouts you can design the form of the Web viewer, PDF and EPUB files in a variety of ways. Figure 11: Selecting a Theme and a Layout in Exporting PDF. History management You can store the entire editing history and maintain documents safely even during a collaborative editing through the feature of history comparison and deleted paragraph restoration. Figure 12: History Comparison. Putting Callouts on Pictures with Visual Editor When you write a software manual, sometimes you need to put callouts to screenshots. You can do it simply with the Visual Editor. Figure 13: Visual Editor. Also with the Visual Editor you can write the following content by using the "callout list" paragraph type and the "callout" character type. Figure 14: Paragraph Callouts in 3Rabbitz Book. Automation and Reuse 3Rabbitz Book supports automation for a variety of items. Creates a PDF title page and the first page automatically. Creates the order of titles, tables and figures automatically. Creates indexes automatically. When you create multiple manuals with similar contents, some redundant contents will occur. 3Rabbitz Book provides the feature to reuse them effectively. Can reuse content in the unit of chapters. Can reuse content in the unit of sections or paragraphs. Can refer to image paragraphs. Can use themes and layouts in multiple manuals. Easy Installation and Update You can install 3Rabbitz Book simply by decompressing the setup file and configuring options with the Web-based Installation Wizard on a variety of OSs, including Windows, Linux, Unix and Mac. As 3Rabbitz Book is a server application developed in Java, it supports those operating systems which can run Java 6 or higher versions. Figure 15: Installation Wizard. After checking for a new release, you can update 3Rabbitz Book simply with a few clicks and a server restart. Figure 16: Software Update. Conclusion I hope 3Rabbitz Book can be a satisfactory solution for authors and projects out there who could not find a way to overcome the inefficiencies in writing a manual whenever a new product is released or a service is upgraded. 
We are very thankful to the 3Rabbitz Book developers for providing us with a free license. For more information about using 3Rabbitz Book, see the Installation Guide or the User Guide. You can also write an email to [email protected] if you would like to directly communicate with the developers. We also found another solution called FrameMaker by Adobe. But we could not try it, as it costs 999 USD. If you have any experience with this tool or any other tool, please share your experience in the comments below. [Less]
Posted over 11 years ago by Esen Sagynov
Six months have passed since we started our program and we would like to announce the results as well as the names of the developers who will receive the first batch of donations from the CUBRID open source project. The CUBRID Affiliates Program was a great ... [More] success for sure! Originally we had planned to reward $3000 in total to the top 3 projects. Over the course of the last six months we have discovered many useful open source applications written in various programming languages. Some of them have already become our affiliates by adding CUBRID Database support to their applications. Considering the effort all individual developers have put into adding CUBRID support to their own or third-party applications (yes, we did accept developers who contributed CUBRID code to other open source projects), we have decided to reward all of them with a donation of at least $150. Depending on the effort it took, as well as the additional contributions (issue reports, questions asked, blog posts, tutorials) made by each developer, the donation ranges from 150 USD to 500 USD. We would like to thank all of the developers for their effort and willingness to support the CUBRID open source database project one way or another. We are also extremely happy to be able to support them in return!
1. jOOQ, by Lukas Eder, Switzerland: $500
2. SIDU, by Topnew Geo, Australia: $400
3. SOFA Statistics, by Grant Paton-Simpson, New Zealand: $300
4. ART and Quartz, by Timothy Anyona, Kenya: $150 + $150
5. Brig, by Andrew Naplavkov, Russia: $150
6. DataCleaner, by Kasper Sørensen, Denmark: $150
7. GestDB, by Arsenio Molinero, Spain: $150
8. JMyETL, by Xiong He, China: $150
9. JWhoisServer, by Klaus Zerwes, Germany: $150
10. Kalkun, by Azhari Harahap, Indonesia: $150
11. RedBeanPHP, by Gabor de Mooji, Netherlands: $150
12. Sequel, by Jeremy Evans, USA: $150
13. Tadpole for DB, by HyunJong Cho, South Korea: $150
14. QDepo, by Stefan Suciu, Romania: $150
As you can see in the above table, the first 3 applications will receive donations of $500, $400 and $300 respectively. The rest of the contributors will receive a $150 donation each. The exception is the ART project, whose developer Timothy Anyona has also contributed to the Quartz Scheduler project. Thanks to him, Quartz Scheduler now supports CUBRID Database. The 14 applications that have become our affiliates come from all over the world, from 4 different continents and 14 different countries. In the image below, you can see how CUBRID has reached developers on a worldwide scale. The CUBRID Affiliates Program does NOT stop here. We will continue to accept new open source projects as our Affiliates. If you develop or contribute to an open source project and you want to add CUBRID Database support, we will be very glad to help you and see you onboard! Please contact us by email at [email protected] and let us know about your project. We will be guiding you all the time by answering your questions, resolving encountered issues, and simply talking (join our IRC). Remember that CUBRID provides over 90% SQL compatibility with MySQL, so porting an app to CUBRID should be straightforward. Once again thank you Lukas, Topnew, Grant, Timothy, Andrew, Kasper, Arsenio, Xiong He, Klaus, Azhari, Gabor, Jeremy, HyunJong, and Stefan for supporting our project! We will contact you by email regarding how you can receive the donation. If you have questions, feel free to ask on our dedicated Q&A site, forum, Twitter, Facebook, Google+, #cubrid IRC channel, or contact us by email. We will be glad to answer you! [Less]
Posted over 11 years ago by Esen Sagynov
October is full of great news for the CUBRID community! We have released a new Node.js driver for CUBRID to allow developers who love both JavaScript and event-driven programming to work with CUBRID, the most optimized database for Web applications. ... [More] We have updated most of our drivers and tools, which now provide extended functionality and improved stability. We were invited to speak at the HighLoad++ Developers Conference in Moscow, Russia, where we introduced the new features of CUBRID Database for horizontal scalability. And today I am extremely happy to announce a new major CUBRID 9.0 release which provides users with tons of valuable features, significantly increased performance, and native support for database sharding. New Features Multilingual Support: CUBRID 9.0 provides extended support for character sets, collations, calendars and number notations of various languages. Now, besides English and Korean, we support Japanese, Chinese, Vietnamese, Cambodian, Turkish, German, Spanish, French, and Italian. Extended SQL: Prior to version 9.0, CUBRID had already provided over 90% SQL compatibility with MySQL. In this new version we have further extended the SQL syntax for even more compatibility with MySQL and Oracle databases. Added: analytic functions; aggregate functions with the OVER clause; MERGE statement support; JOIN support in DELETE/UPDATE statements; the ENUM data type; pseudo columns (SYSDATE and USER) as DEFAULT values; a stand-alone VALUES clause; the OFFSET keyword in the LIMIT clause; the LEVEL pseudo column in hierarchical queries; and the INET_ATON and INET_NTOA functions. Now, when migrating data to the latest version of CUBRID, most of the SQL statements written in your applications will stay intact. Moreover, the CUBRID Migration Toolkit will automatically match most MySQL/Oracle/CUBRID data types to the most appropriate one in CUBRID 9.0, thus providing a seamless migration experience. To learn more about each of these extensions, refer to the 9.0 Release Notes. Advanced Indexing: We have added several very important indexing techniques which significantly improve the performance of READ operations. Function-based indexing allows you to include function expressions among the columns comprising an index. Filtered indexing allows you to include search conditions in an index. Index skip scan optimization allows users to use a multi-column index from its second column even when the first column is not specified. This all adds up to the optimizations we have introduced in previous versions. They are: index types (Reverse Index, Unique Index, Primary Key, Foreign Key); query optimizations (Multi-range limit optimization, Key limit optimization, Skip ORDER BY, Skip GROUP BY, Prefix Index, Index range scan optimization, Covering Index, Descending Index, Query Rewrites); and server level optimizations (Shared Query Plan Cache, Locking Optimizations, Transaction Concurrency, Log compression). Database Sharding Support Database Sharding is one of the most important and most substantial features that the open source CUBRID RDBMS provides. Last week at the Highload++ conference in Moscow we presented CUBRID SHARD. The presentation slides are available below. Thus, together with native support for High Availability (HA) with fail-over, CUBRID provides the best set of RDBMS functionality for large-scale Web applications.
For further reading about CUBRID SHARD, refer to the following blog articles and the CUBRID Manual: Database Sharding the Right Way: Easy, Reliable, and Open Source (HighLoad++ 2012); Database Sharding with CUBRID; Database Sharding Platform at NHN; Meet CUBRID Database Developers at Russian Internet Technologies 2012 Conference; and the CUBRID SHARD Manual. Additional features: Some other new features in CUBRID 9.0 are Cursor Holdability support and Improved Error Messages. Improved stability and performance. Overall Engine performance: In CUBRID 9.0 we have significantly improved the overall performance of the CUBRID Engine as well as its stability. The throughput and response time of CUBRID 9.0 have been improved by more than 300% when compared to the previous version. Figure 1: The number of read/write requests per second of the SysBench benchmark. Figure 2: The average execution time per request of the SysBench benchmark. Figure 3: The accumulated number of transactions of the SysBench benchmark. Improved READ performance: To measure the effect of the new indexing types and optimizations on READ performance, we have run a basic performance test on CUBRID 8.4.1 and 9.0. SELECT performance in the new version has increased by more than 160%, while the performance of WRITE operations has remained at the same level. Figure 4: Performance Comparison between R4.1 Patch 6 and 9.0 Beta (Linux 32-bit). Improved stability and performance of Partitioning: In CUBRID 9.0 we have fundamentally enhanced the Partitioning feature for better stability and performance. In addition, we have added support for the PROMOTE statement, which allows users to promote a specific partition from a partitioned table to a general table. Improved HA Stability and Operating Convenience: HA has been the flagship feature of CUBRID since version 8.2.0. In this new version we have fixed many stability issues in CUBRID HA. This version also provides separate control of the HA management process and easier dynamic addition and deletion of nodes in the HA management process. So CUBRID 9.0 is the best RDBMS we have ever released so far, with tons of valuable features, significantly increased performance, and native support for database sharding. We have updated most of the APIs that you can use to connect to CUBRID 9.0. You can download PHP, PDO, Python, Ruby, Perl, ADO.NET, OLEDB, ODBC, JDBC, C, and Node.js drivers from http://www.cubrid.org/downloads. For more information about the new features and improvements in CUBRID 9.0, refer to the Release Notes. If you have questions, feel free to ask on our dedicated Q&A site, forum, Twitter, Facebook, Google+, #cubrid IRC channel, or contact us by email. We will be glad to answer you! [Less]
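As a closing illustration of a couple of the SQL extensions listed above (an analytic function with the OVER clause, and the OFFSET keyword in the LIMIT clause), here is a minimal JDBC sketch. The table and column names are hypothetical, and the sketch assumes the commonly used CUBRID JDBC driver class name and the exact set of analytic functions shipped in 9.0, so treat it as an illustration rather than a verbatim recipe:

```java
import java.sql.Connection;
import java.sql.DriverManager;
import java.sql.ResultSet;
import java.sql.Statement;

public class Cubrid9SqlSketch {
    public static void main(String[] args) throws Exception {
        // Commonly used CUBRID JDBC driver class; adjust if your driver differs.
        Class.forName("cubrid.jdbc.driver.CUBRIDDriver");
        String url = "jdbc:cubrid:localhost:33000:demodb:::";

        try (Connection conn = DriverManager.getConnection(url, "dba", "");
             Statement stmt = conn.createStatement()) {

            // An analytic function with OVER, combined with LIMIT ... OFFSET paging.
            // "player", "name" and "score" are hypothetical example names.
            String sql = "SELECT name, score, "
                       + "RANK() OVER (ORDER BY score DESC) AS score_rank "
                       + "FROM player "
                       + "LIMIT 10 OFFSET 20";

            try (ResultSet rs = stmt.executeQuery(sql)) {
                while (rs.next()) {
                    System.out.printf("%s scored %d (rank %d)%n",
                            rs.getString("name"), rs.getInt("score"), rs.getInt("score_rank"));
                }
            }
        }
    }
}
```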