| author | Brian S. O'Neill <bronee@gmail.com> | 2006-10-24 05:06:11 +0000 | 
|---|---|---|
| committer | Brian S. O'Neill <bronee@gmail.com> | 2006-10-24 05:06:11 +0000 | 
| commit | c52e4c055a9fa0f59a1ca19538d88377b51de44b (patch) | |
| tree | d29293b8a1e6cd491ae0c5d74eedba1885386705 /src/site/fml | |
| parent | 79347bcbe76acba5e4b9d94ce25d2d445fe7670f (diff) | |
Added technical FAQ.
Diffstat (limited to 'src/site/fml')
| -rw-r--r-- | src/site/fml/technical-faq.fml | 326 | 
1 files changed, 326 insertions, 0 deletions
| diff --git a/src/site/fml/technical-faq.fml b/src/site/fml/technical-faq.fml
new file mode 100644
index 0000000..74aee55
--- /dev/null
+++ b/src/site/fml/technical-faq.fml
@@ -0,0 +1,326 @@
 +<?xml version="1.0"?>
 +<faqs id="FAQ" title="Frequently Asked Technical Questions">
 + <part id="General">
 +
 +   <faq id="process-killed">
 +     <question>What happens when a Carbonado process is killed while in the middle of a transaction?</question>
 +     <answer>
 +       <p>
 +Carbonado uses shutdown hooks to make sure that all in-progress transactions
 +are properly rolled back. If you hard-kill a process (kill -9), then the
 +shutdown hooks won't run. This can cause a problem when using BDB. In either
 +case, db_recover must be run to prevent future data corruption. BDB-JE is not
 +affected, however, as it automatically runs recovery upon restart.
 +       </p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="replicated-bootstrap">
 +     <question>How do I bootstrap a replicated repository?</question>
 +     <answer>
 +<p>
 +By running a resync operation programmatically. The ReplicatedRepository has a
 +ResyncCapability which has a single method named "resync". It accepts a
 +Storable class type, a throttle parameter, and an optional filter. Consult the
 +<a href="http://carbonado.sourceforge.net/apidocs/com/amazon/carbonado/capability/ResyncCapability.html">Javadocs</a> for more info.
 +</p>
 +<p>
 +In your application you might find it convenient to add an administrative
 +command to invoke the resync operation. This makes it easy to repair any
 +inconsistencies that might arise over time.
 +</p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="deadlock">
 +     <question>I sometimes see lock timeout errors and deadlocks. What is going on?</question>
 +     <answer>
 +<p>
 +A lock timeout may be caused by an operation that took too long, or it may
 +indicate a deadlock. By default, Carbonado uses a lock timeout of 0.5 seconds
 +for BDB-based repositories. It can be changed by calling setLockTimeout on the
 +repository builder.
 +</p>
 +<p>
 +Deadlocks may be caused by:
 +</p>
 +<p>
 +<ol>
 +<li>Application lock acquisition order</li>
 +<li>BDB page split</li>
 +<li>Index update</li>
 +</ol>
 +</p>
 +<p>
 +In the first case, applications usually cause deadlock situations by updating a
 +record within the same transaction that previously read it. This causes the
 +read lock to be upgraded to a write lock, which is inherently deadlock
 +prone. To prevent this problem, switch the transaction to update mode. This
 +causes all acquired read locks to be upgradable, usually by acquiring a write
 +lock from the start.
 +</p>
 +<p>
 +Another cause of this deadlock is iterating over a cursor and updating
 +entries as you go. To resolve this, either copy the cursor entries to a list
 +first, or operate within a transaction which is in update mode.
 +</p>
 +<p>
 +The second case, BDB page split, is a problem in the regular BDB product.
 +It is not a problem with BDB-JE. When records are inserted into a BDB
 +database, it may need to rebalance the b-tree structure. It does this by
 +splitting a leaf node and updating the parent node. To update the parent node,
 +a write lock must be acquired, but another thread might hold a read lock on it
 +while trying to lock the leaf node being split.
 +</p>
 +<p>
 +There is no good solution to the BDB page split deadlock. Instead, your
 +application must be coded to catch deadlocks and retry transactions. These
 +deadlocks are more common when filling up a new BDB.
 +</p>
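 +<p>
 +A catch-and-retry loop might be sketched as follows. This is only an
 +illustration: DeadlockDetectedException and DeadlockRetry are hypothetical
 +stand-ins, not Carbonado classes, and the back-off delay is an arbitrary
 +assumption. The unit of work is expected to enter, commit and exit its own
 +transaction on every attempt.
 +</p>

```java
import java.util.concurrent.Callable;

// Hypothetical stand-in for whatever deadlock exception your repository
// throws; this class is not part of Carbonado.
class DeadlockDetectedException extends Exception {
    DeadlockDetectedException(String message) {
        super(message);
    }
}

class DeadlockRetry {
    /**
     * Runs the given unit of work, retrying when it fails with a deadlock.
     * Any other exception is propagated immediately.
     */
    static <T> T runWithRetry(Callable<T> work, int maxAttempts) throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        for (int attempt = 1; ; attempt++) {
            try {
                return work.call();
            } catch (DeadlockDetectedException e) {
                if (attempt >= maxAttempts) {
                    throw e; // give up and report the last deadlock
                }
                // Back off briefly so the competing thread can finish.
                Thread.sleep(10L * attempt);
            }
        }
    }
}
```

 +<p>
 +Bounding the number of attempts keeps a persistent deadlock from turning into
 +an endless loop.
 +</p>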
 +<p>
 +The third case, index updates, is caused by updating a record while another
 +thread is using the index to find the record. Carbonado's indexing strategy
 +could be coded to defer index updates when this happens, but it currently does
 +not. In the meantime, there is no general solution.
 +</p>
 +<p>
 +Lock timeouts (or locks not granted) may be caused by:
 +</p>
 +<p>
 +<ol>
 +<li>Failing to exit all transactions</li>
 +<li>Open cursors with REPEATABLE_READ isolation</li>
 +<li>Heavy concurrency</li>
 +</ol>
 +</p>
 +<p>
 +If any transactions are left open, then the locks they acquired are never
 +released. Over time the database lock table fills up. When using BDB, the
 +"db_stat -c" command can show information on the lock table. Running
 +"db_recover" can clear any stuck locks. To avoid this problem, always run
 +transactions within a try-finally statement and exit the transaction in the
 +finally section.
 +</p>
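 +<p>
 +The shape of the try-finally pattern can be sketched as follows. RecordingTxn
 +is a hypothetical stand-in that only records which calls were made, so that
 +the control flow is visible; it is not the Carbonado transaction class.
 +</p>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for a transaction handle; it only records calls.
class RecordingTxn {
    final List<String> events = new ArrayList<>();

    void commit() {
        events.add("commit");
    }

    void exit() {
        events.add("exit");
    }
}

class TxnScope {
    /** Runs the work inside the transaction and always exits it afterwards. */
    static void runInTxn(RecordingTxn txn, Runnable work) {
        try {
            work.run();
            txn.commit(); // reached only if the work succeeded
        } finally {
            txn.exit(); // always runs, so locks are released on failure too
        }
    }
}
```

 +<p>
 +If the work throws, commit is skipped but exit still runs, which is what
 +keeps locks from being left behind.
 +</p>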
 +<p>
 +By default, BDB transactions have REPEATABLE_READ isolation level. This means
 +that all locks acquired when iterating cursors within the transaction are not
 +released until the transaction exits. This can cause the lock table to fill
 +up. To work around this, enter the transaction with an explicit isolation level
 +of READ_COMMITTED, which releases read locks sooner.
 +</p>
 +<p>
 +Applications that have a high number of active threads can cause general lock
 +contention. BDB-JE uses row-level locks, and so lock timeouts caused by
 +contention are infrequent. The regular BDB product uses page-level locks, thus
 +increasing the likelihood of lock contention.
 +</p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="subselect">
 +     <question>How do I perform a subselect?</question>
 +     <answer>
 +<p>
 +Carbonado query filters do not support subselects, although they can be
 +emulated. Suppose the query you wish to execute looks something like this in
 +SQL:
 +</p>
 +<p><pre>
 +select * from foo where foo.name in (select name from bar where ...)
 +</pre></p>
 +<p>
 +This can be emulated by querying bar, and for each result, fetching foo.
 +</p>
 +<p><pre>
 +// Note that the query is ordered by name.
 +Cursor<Bar> barCursor = barStorage.query(...).orderBy("name").fetch();
 +String lastNameSeen = null;
 +while (barCursor.hasNext()) {
 +    Bar bar = barCursor.next();
 +    if (lastNameSeen != null && lastNameSeen.equals(bar.getName())) {
 +        continue;
 +    }
 +    lastNameSeen = bar.getName();
 +    Foo foo = fooStorage.query("name = ?").with(lastNameSeen).tryLoadOne();
 +    if (foo != null) {
 +        // Got a result, do something with it.
 +        ...
 +    }
 +}
 +</pre></p>
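 +<p>
 +The duplicate-skipping logic in the loop above can be tried out in isolation.
 +In this sketch a sorted list of names stands in for the ordered bar cursor and
 +a map stands in for the lookup on foo; SubselectEmulation and both stand-ins
 +are hypothetical and involve no Carbonado classes.
 +</p>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

class SubselectEmulation {
    /**
     * For each distinct name in sortedBarNames, looks up the matching foo.
     * The input must be sorted so that duplicate names are adjacent.
     * The map is a stand-in for an indexed lookup on Foo's name property.
     */
    static List<String> fetchFoos(List<String> sortedBarNames, Map<String, String> fooByName) {
        List<String> results = new ArrayList<>();
        String lastNameSeen = null;
        for (String name : sortedBarNames) {
            if (lastNameSeen != null && lastNameSeen.equals(name)) {
                continue; // same name as the previous row, already handled
            }
            lastNameSeen = name;
            String foo = fooByName.get(name);
            if (foo != null) {
                results.add(foo);
            }
        }
        return results;
    }
}
```

 +<p>
 +Remembering only the last name seen is enough because the input is ordered by
 +name. Without the ordering, a set of seen names would be needed instead.
 +</p>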
 +<p>
 +For best performance, you might want to make sure that Foo has an index on its name property.
 +</p>
 +<p>
 +You may track the feature request <a href="http://sourceforge.net/tracker/index.php?func=detail&aid=1578197&group_id=171277&atid=857357">here</a>.
 +</p>
 +    </answer>
 +   </faq>
 +
 +   <faq id="table-generation">
 +     <question>Does Carbonado support generating SQL tables from Storables?</question>
 +     <answer>
 +<p>
 +No, it does not. Carbonado instead requires that your Storable definition
 +matches a table, if using the JDBC repository. When using a repository that has
 +no concept of tables, like the BDB repositories, the Storable is the canonical
 +definition. In that case, changes to the Storable effectively change the
 +"table". In addition, properties can be added and removed, and older records
 +can still be read.
 +</p>
 +<p>
 +Although it is technically feasible for Carbonado to support generating SQL
 +tables, Storable definitions are not expressive enough to cover all the
 +features that can go into a table. For example, you cannot currently define a
 +foreign key constraint in Carbonado.
 +</p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="isnull">
 +     <question>How do I query for "IS NULL"?</question>
 +     <answer>
 +<p>
 +Carbonado treats nulls as ordinary values wherever possible, so nothing special
 +needs to be done. That is, just search for null like any other value. The query
 +call might look like:
 +</p>
 +<p><pre>
 +Query<MyType> query = storage.query("value = ?").with(null);
 +Cursor<MyType> cursor = query.fetch();
 +...
 +</pre></p>
 +<p>
 +When using the JDBC repository, the generated SQL will contain the "IS NULL"
 +phrase in the WHERE clause.
 +</p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="sql-debugging">
 +     <question>How do I see generated SQL?</question>
 +     <answer>
 +       <p>
 +To see the SQL statements generated by the JDBC repository, you can install a
 +JDBC DataSource that logs all activity. Provided in the JDBC repository package
 +is the LoggingDataSource class, which does this. As a convenience, it can be
 +installed simply by calling setDataSourceLogging(true) on the
 +JDBCRepositoryBuilder.
 +       </p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="jdbc-indexes">
 +     <question>What happens if JDBC repository cannot get index info?</question>
 +     <answer>
 +       <p>
 +The JDBC repository checks if the Storable alternate keys match those defined
 +in the database. To do this, it tries to get the index info. If the user
 +account does not have permissions, a message is logged and this check is
 +skipped. This should not cause any harm, unless the alternate keys don't
 +match. A mismatch can cause unexpected errors when using the replicated
 +repository.
 +       </p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="mysql-increment">
 +     <question>How do I use MySQL auto-increment columns?</question>
 +     <answer>
 +       <p>
 +As of 2006-10-23, Carbonado MySQL support is very thin. The @Sequence
 +annotation is intended to be used for mapping to auto-increment columns, if the
 +database does not support proper sequences. Until support is added,
 +auto-increment columns will not work.
 +       </p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="unique">
 +     <question>Can I do the equivalent of a "unique" constraint?</question>
 +     <answer>
 +       <p>
 +The @AlternateKeys annotation is provided specifically for this purpose. Both
 +@PrimaryKey and @AlternateKeys define unique indexes. The only real difference
 +is that there can be only one primary, but many alternate keys are allowed.
 +       </p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="join-cache">
 +     <question>How does one manually flush the Carbonado join cache?</question>
 +     <answer>
 +
 +       <p>
 +The Carbonado join cache is a lazy-read cache, local to a Storable instance. It
 +is not a global write-through cache, and so no flushing is necessary.
 +</p>
 +<p>
 +The first time a join property is accessed, a reference is saved in the
 +master Storable instance. This optimization makes the most sense when filtering
 +based on a join property. The query loads the join property, and you'll likely
 +want it too. This avoids a double load.
 +       </p>
 +
 +     </answer>
 +   </faq>
 +
 +   <faq id="evolution">
 +     <question>How can schemas evolve?</question>
 +     <answer>
 +       <p>
 +Independent repositories, like BDB, support automatic schema evolution. You may
 +freely add or remove non-primary key properties and still load older
 +storables. Changes to primary key properties are not supported, since they
 +define a clustered index. Also, property data types cannot be changed, except
 +that a boxed property may be changed to a non-boxed primitive or vice versa.
 +</p>
 +<p>
 +Every storable persisted by Carbonado in BDB starts with a layout version,
 +which defines the set of properties encoded. Carbonado separately persists the
 +mapping from layout version to property set, such that when it decodes a
 +storable it knows what properties to expect.
 +</p>
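 +<p>
 +The idea of a version-prefixed record can be illustrated with a toy decoder.
 +The flat record form used here (a version number followed by the values) is
 +invented for illustration and is not Carbonado's actual encoding, and
 +LayoutDecoder is a hypothetical class.
 +</p>

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class LayoutDecoder {
    // Maps a layout version to the ordered property names it encodes.
    private final Map<Integer, List<String>> layouts = new HashMap<>();

    void registerLayout(int version, List<String> propertyNames) {
        layouts.put(version, propertyNames);
    }

    /**
     * Decodes a record of the form [version, value1, value2, ...].
     * This flat encoding is invented for illustration only.
     */
    Map<String, Object> decode(List<Object> record) {
        int version = (Integer) record.get(0);
        List<String> names = layouts.get(version);
        if (names == null) {
            throw new IllegalArgumentException("Unknown layout version: " + version);
        }
        Map<String, Object> properties = new LinkedHashMap<>();
        for (int i = 0; i < names.size(); i++) {
            properties.put(names.get(i), record.get(i + 1));
        }
        return properties;
    }
}
```

 +<p>
 +Because the version travels with each record, records written under an older
 +layout stay readable after the property set changes.
 +</p>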
 +<p>
 +When adding or removing properties, existing persisted storables are not
 +immediately modified. If you remove a property and add it back, you can recover
 +the property value still encoded in the existing storables. Property values are
 +not fully removed from an existing storable instance until it is explicitly
 +updated. At this time, the layout version used is the current one, and the
 +removed property values are lost.
 +</p>
 +<p>
 +When loading a storable which does not have a newly added property, the
 +property value is either null, 0, or false, depending on the data type. You can
 +call the isPropertyUninitialized method on the storable to determine if this
 +default property value is real or not.
 +</p>
 +<p>
 +In order to change a property type to something that cannot be automatically
 +converted, the change must be performed in phases. First, define a new
 +property, with a different name. Then load all the existing storables and
 +update them, setting the new property value. Next, remove the old property. To
 +potentially free up storage you can update all the storables again. If you wish
 +the newly added property to retain the original name, follow these steps again
 +in a similar fashion to change it.
 +</p>
 +     </answer>
 +   </faq>
 +
 +   <faq id="iterate-all">
 +     <question>How do I iterate over all storable types in a repository?</question>
 +     <answer>
 +       <p>
 +Given a repository and an appropriately set classpath, can we iterate through
 +all the various storables held in the repository without actually knowing what
 +the repository might hold in advance?
 +</p>
 +<p>
 +Repositories that implement StorableInfoCapability provide this
 +functionality. The reason it's a capability is that some repositories (JDBC)
 +don't have a registry of storables. BDB-based ones do, and so this capability
 +works.
 +</p>
 +     </answer>
 +   </faq>
 +
 + </part>
 +</faqs>
 | 
