
Korean postpositions

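Korean marks grammatical roles with postpositions that attach directly to the end of a word, so the same word shows up in many different forms. That complicates things for Relevanssi. Fortunately, it’s easy to strip the most common postpositions from the ends of words. Add this function to your site: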

add_filter( 'relevanssi_stemmer', 'relevanssi_korean_plural_stemmer' );
function relevanssi_korean_plural_stemmer( $term ) {
    // Hangul is multibyte in UTF-8, so count and cut by character with the
    // mb_* functions; strlen() and substr() work on bytes and never match.
    $len = mb_strlen( $term, 'UTF-8' );

    // Two-character postpositions first, so '으로' is not mistaken for '로'.
    $end2 = mb_substr( $term, -2, 2, 'UTF-8' );
    if ( '으로' === $end2 || '처럼' === $end2 ) {
        // Strip only when at least two characters of the word remain.
        return $len > 3 ? mb_substr( $term, 0, -2, 'UTF-8' ) : $term;
    }

    // One-character postpositions.
    $end1 = mb_substr( $term, -1, 1, 'UTF-8' );
    $one  = array( '은', '에', '는', '이', '가', '을', '를', '와', '과', '로', '도', '만', '의' );
    if ( $len > 2 && in_array( $end1, $one, true ) ) {
        $term = mb_substr( $term, 0, -1, 'UTF-8' );
    }
    return $term;
}
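
To see what this kind of stemming does to individual words, here is a standalone sketch that runs outside WordPress. The helper name `strip_korean_postposition` and the sample words are illustrative assumptions, not part of Relevanssi. The sketch assumes the mbstring extension and works on characters rather than bytes, since each Hangul syllable takes three bytes in UTF-8.

```php
<?php
// Standalone sketch; assumes the mbstring extension and a UTF-8 source file.
// strip_korean_postposition() is a hypothetical helper, not a Relevanssi function.
function strip_korean_postposition( string $term ): string {
    // Two-character postpositions come first so '으로' is tried before '로'.
    $suffixes = array(
        '으로', '처럼',
        '은', '에', '는', '이', '가', '을', '를', '와', '과', '로', '도', '만', '의',
    );
    $len = mb_strlen( $term, 'UTF-8' );
    foreach ( $suffixes as $suffix ) {
        $cut = mb_strlen( $suffix, 'UTF-8' );
        if ( mb_substr( $term, -$cut, $cut, 'UTF-8' ) === $suffix ) {
            // Strip only when at least two characters of the word remain.
            return ( $len - $cut >= 2 ) ? mb_substr( $term, 0, -$cut, 'UTF-8' ) : $term;
        }
    }
    return $term;
}

// Byte-based functions miscount Hangul: each syllable is three bytes in UTF-8.
echo strlen( '학교에' ), ' bytes, ', mb_strlen( '학교에', 'UTF-8' ), " characters\n";

// Illustrative sample words.
foreach ( array( '학교에', '학교는', '도서관으로', '친구처럼', '학교' ) as $word ) {
    echo $word, ' => ', strip_korean_postposition( $word ), "\n";
}
```

The script reports 9 bytes and 3 characters for `학교에`, then prints `학교` for `학교에`, `학교는` and `학교`, `도서관` for `도서관으로`, and `친구` for `친구처럼`. The rule is purely suffix-based, so a word that merely ends in one of these syllables is shortened as well.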

After you’ve added this function, rebuild the index. You also need to adjust the minimum word length to 2, as many Korean words are only two characters long.

Cheonmu created this function and posted it on the Relevanssi support forums.
