From f1a2e3e954f2c54db94206a842cc4e87dbaa5c50 Mon Sep 17 00:00:00 2001
From: "Brian S. O'Neill"
-Carbonado uses shutdown hooks to make sure that all in progress transactions
-are properly rolled back. If you hard-kill a process (kill -9), then the
-shutdown won't get run. This can cause a problem when using BDB, and
-db_recover must be run to prevent future data corruption. BDB-JE is not
-affected, however, as it automatically runs recovery upon restart.
-
-By running a resync operation programmatically. The ReplicatedRepository has a
-ResyncCapability which has a single method named "resync". It accepts a
-Storable class type, a throttle parameter, and an optional filter. Consult the
-Javadocs for more info.
-
-In your application you might find it convenient to add an administrative
-command to invoke the resync operation. This makes it easy to repair any
-inconsistencies that might arise over time.
-
-A lock timeout may be caused by an operation that for whatever reason took too
-long, or it may also indicate a deadlock. By default, Carbonado uses a lock
-timeout of 0.5 seconds for BDB based repositories. It can be changed by calling
-setLockTimeout on the repository builder.
-
-Deadlocks may be caused by:
-
-
-
-
-In the first case, applications usually cause deadlock situations by updating a -record within the same transaction that previously read it. This causes the -read lock to be upgraded to a write lock, which is inherently deadlock -prone. To prevent this problem, switch the transaction to update mode. This -causes all acquired read locks to be upgradable, usually by acquiring a write -lock from the start. -
--Another cause of this deadlock is when you iterate over a cursor, updating -entries as you go. To resolve this, either copy the cursor entries to a list -first, or operate within a transaction which is in update mode. -
--The second case, BDB page split, is a problem in the regular BDB product. -It is not a problem with BDB-JE. When inserting records into a BDB, -it may need to rebalance the b-tree structure. It does this by splitting a leaf -node and updating the parent node. To update the parent node, a write lock must -be acquired but another thread might have a read lock on it while trying to -lock the leaf node being split. -
--There is no good solution to the BDB page split deadlock. Instead, your -application must be coded to catch deadlocks are retry transactions. They are -more common when filling up a new BDB. -
--The third case, index updates, is caused by updating a record while another -thread is using the index for finding the record. Carbonado's indexing strategy -can be coded to defer index updates when this happens, but it currently does -not. In the meantime, there is no general solution. -
--Lock timeouts (or locks not granted) may be caused by: -
--
-If any transactions are left open, then any locks it acquired don't get -released. Over time the database lock table fills up. When using BDB, the -"db_stat -c" command can show information on the lock table. Running -"db_recover" can clear any stuck locks. To avoid this problem, always run -transactions within a try-finally statement and exit the transaction in the -finally section. -
--By default, BDB transactions have REPEATABLE_READ isolation level. This means -that all locks acquired when iterating cursors within the transaction are not -released until the transaction exits. This can cause the lock table to fill -up. To work around this, enter the transaction with an explicit isolation level -of READ_COMMITTED which releases read locks sooner. -
--Applications that have a high number of active threads can cause general lock -contention. BDB-JE uses row-level locks, and so lock timeouts caused by -contention are infrequent. The regular BDB product uses page-level locks, thus -increasing the likelyhood of lock contention. -
- - - --Carbonado query filters do not support subselects, athough it can be -emulated. Suppose the query you wish to execute looks something like this in -SQL: -
--select * from foo where foo.name in (select name from bar where ...) --
-This can be emulated by querying bar, and for each result, fetching foo. -
--// Note that the query is ordered by name. -Cursor<Bar> barCursor = barStorage.query(...).orderBy("name").fetch(); -String lastNameSeen = null; -while (barCursor.hasNext()) { - Bar bar = barCursor.next(); - if (lastNameSeen != null && lastNameSeen.equals(bar.getName()) { - continue; - } - lastNameSeen = bar.getName(); - Foo foo = fooStorage.query("name = ?").with(lastNameSeen).tryLoadOne(); - if (foo != null) { - // Got a result, do something with it. - ... - } -} --
-For best performance, you might want to make sure that Foo has an index on its name property. -
--You may track the feature request here. -
--No, it does not. Carbonado instead requires that your Storable definition -matches a table, if using the JDBC repository. When using a repository that has -no concept of tables, like the BDB repositories, the Storable is the canonical -definition. In that case, changes to the Storable effectively change the -"table". In addition, properties can be added and removed, and older records -can still be read. -
--Although it is technically feasible for Carbonado to support generating SQL -tables, Storable definitions are not expressive enough to cover all the -features that can go into a table. For example, you cannot currently define a -foreign key constraint in Carbonado. -
--Carbonado treats nulls as ordinary values wherever possible, so nothing special -needs to be done. That is, just search for null like any other value. The query -call might look like: -
--Query<MyType> query = storage.query("value = ?").with(null); -Cursor<MyType> = query.fetch(); -... --
-When using the JDBC repository, the generated SQL will contain the "IS NULL" -phrase in the WHERE clause. -
--To see the SQL statements generated by the JDBC repository, you can install a -JDBC DataSource that logs all activity. Provided in the JDBC repository package -is the LoggingDataSource class, which does this. As a convenience, it can be -installed simply by calling setDataSourceLogging(true) on the -JDBCRepositoryBuilder. -
--Alternatively, you can call Query.printNative(), which by default prints the -native query to standard out. When using the JDBC repository, this will print -the SQL statement. -
--Carbonado version 1.1 has thin support for MySQL. Version 1.2 (in the 1.2-dev branch) -supports an @Automatic annotation for supporting MySQL auto-increment columns. -
--The @AlternateKeys annotation is provided specifically for this purpose. Both -@PrimaryKey and @AlternateKeys define unique indexes. The only real difference -is that there can be only one primary, but many alternate keys are allowed. -
--Carbonado does not require repository implementations to perform any -caching. If you're using just the JDBC repository, there's no cache. A general -purpose caching repository was in development, but it was shelved because there -was no immediate need for it. The replicated repository however, can be -considered to be a complete cache. -
--The only built in caching is for join properties on Storable instances. It just -lazily sets the join result to an internal field of the Storable instance. The -join property value is not shared with other Storable instances. -
--The Carbonado join cache is a lazy-read cache, local to a Storable instance. It -is not a global write-through cache, and so no flushing is necessary. -
--The first time a join property has been accessed, a reference is saved in the -master Storable instance. This optimization makes the most sense when filtering -based on a join property. The query loads the join property, and you'll likely -want it too. This avoids a double load. -
- --Independent repositories like BDB support automatic schema evolution. You may -freely add or remove non-primary key properties and still load older -storables. Changes to primary key properties is not supported, since they -define a clustered index. Also, property data types cannot be changed except if -a boxed property is changed to a non-boxed primitive and vice versa. -
--Every storable persisted by Carbonado in BDB starts with a layout version, -which defines the set of properties encoded. Carbonado separately persists the -mapping from layout version to property set, such that when it decodes a -storable it knows what properties to expect. -
--When adding or removing properties, existing persisted storables are not -immediately modified. If you remove a property and add it back, you can recover -the property value still encoded in the existing storables. Property values are -not fully removed from an existing storable instance until it is explicitly -updated. At this time, the layout version used is the current one, and the -removed property values are lost. -
--When loading a storable which does not have a newly added property, the -property value is either null, 0, or false, depending on the data type. You can -call the isPropertyUninitialized method on the storable to determine if this -default property value is real or not. -
--In order to change a property type to something that cannot be automatically -converted, the change must be performed in phases. First, define a new -property, with a different name. Then load all the existing storables and -update them, setting the new property value. Next, remove the old property. To -potentially free up storage you can update all the storables again. If you wish -the newly added property to retain the original name, follow these steps again -in a similar fashion to change it. -
--Given a repository and an appropriately set classpath, can we iterate through -all the various storables held in the repository without actually knowing what -the repository might hold in advance? -
--Repositories that implement StorableInfoCapability provide this -functionality. The reason it's a capability is that some repos (JDBC) don't have -a registry of storables. BDB based ones do, and so this capability works for that. -
--The short answer is no -- index integrity is ensured automatically. More details follow: -
--When using the JDBC repository, it is up to the database vendor to ensure that -insert/update/delete operations include index updates within an implicit -auto-commit transaction. All the major database vendors do this properly -already, so nothing special needs to be done here. -
--When using a BDB backed repository, it is up to Carbonado to ensure implicit -transactions are used. Carbonado sets up BDB to be in transaction mode, and -there's no Carbonado level config to disable this. So you're always using BDB -with transactions, and that is good. When you do a lone Carbonado -insert/update/delete operation, it will pass null to BDB for the transaction -object, which implies auto-commit. BDB will automatically enter a tiny -transaction to protect that little change. -
--If the Storable you're updating has any indexes on it, a Carbonado trigger is -installed that updates the affected indexes when you do an -insert/update/delete. The presence of the trigger changes how the -auto-generated Storable behaves. The insert/update/delete operation enters a -transaction automatically, and it doesn't commit until all triggers have -run. Index updates are therefore guarded by transactions, even if you don't -explicitly specify one. In addition, all changes made by your own triggers are -always guarded by a transaction. -
--The cursor iteration and delete operations must be enclosed in the same -transaction. Auto-commit delete while iterating over a cursor fails for some -databases, BDB and BDB-JE in particular. Although BDB supports a delete -operation on the cursor itself, the transaction requirement remains. -
--A workaround exists when using BDB-JE, which works only due to its use of -record-level locks. Calling Cursor.hasNext() forces the cursor to move past the -current record, releasing the lock on the record to be deleted. BDB native uses -page locks, so this trick will only work in the occasional case that the next -record is on another page. -
--The BDB-JE cursor implementation could be changed to automatically move to the -next record, but this reduces portability. Also, the cursor should not move -past the current record automatically if in a transaction. It would allow -another thread to sneak in and modify the record. An isolation level of -repeatable read would be required to keep the lock. -
-