Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others

Categories

0 votes
728 views
in Technique[技术] by (71.8m points)

algorithm - Frequency of each element of an array considering all contiguos subarrays

Consider an array A = [5,1,7,2,3]

All contiguos subarrays = { [5], [1], [7], [2], [3], [5,1], [1,7], [7,2], [2,3], [5,1,7], [1,7,2], [7,2,3], [5,1,7,2], [1,7,2,3], [5,1,7,2,3] }

Replace all the arrays in the above set with maximum element in it:

set will look like this: { [5], [1], [7], [2], [3], [5], [7], [7], [3], [7], [7], [7], [7], [7], [7] }

Frequency info: [5] -> 2, [1] -> 1, [7] -> 9, [2] -> 1, [3] -> 2

My Goal is to find the above frequency info.

My Approach:

First make a list of pair (x,y). x is element in A and its index is y.

LIST : [ (5,1), (1,2), (7,3), (2,4), (3,5) ]

Sort the list in decreasing order with respect to first element.Now,

LIST : [ (7,3), (5,1), (3,5), (2,4), (1,2) ]

Algorithm:

def f( array, first_index, last_index):
       ->select an element from LIST starting from left which
         is not marked as visited and (first_index <= element.second <=   
         last_index)
       ->calculate frequency info of element in tuple as  (element.secondvalue-first_index+1) + (element.secondvalue-first_index+1)*(last_index - element.second_value)
       ->Mark above element as visited
       ->Recursively solve f( array, first_index,element.secondvalue-1 ),f( array,element.secondvalue+1,last_index)    

We can easily set suitable base case.

Time complexity:O(n*n)

I have tried a lot to come with the above algorithm but unable to improve time complexity.How I can do so?Any hint,approach will be appreciated.

See Question&Answers more detail:os

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome To Ask or Share your Answers For Others

1 Answer

0 votes
by (71.8m points)

Based on Paul's answer, I implemented a O(n log n) version using a simplified version of a segment tree. You'll notice that this answer is the same as Paul's, but with optimized versions of leftctsmaller and rightctsmaller.

In practice, what it does is to take an array, let's say:

A = [5,1,7,2,3,7,3,1]

And construct an array-backed tree that looks like this:

segment tree

In the tree, the first number is the value and the second is the index where it appears in the array. Each node is the maximum of its two children. This tree is backed by an array (pretty much like a heap tree) where the children of the index i are in the indexes i*2+1 and i*2+2.

Then, for each element, it becomes easy to find the nearest greater elements (before and after it).

For example, to find the nearest greater element to the left, we go up in the tree searching for the first parent where the value is greater and the index is less than the argument. The answer must be a child of this parent, then we go down in the tree looking for the rightmost node that satisfies the same condition.

I've implemented it in Python (as well as the na?ve version, to check the answer), and it seems to work well.

import sys, random
from collections import defaultdict
from math import log, ceil

def make_tree(A):
    n = 2**(int(ceil(log(len(A), 2))))
    T = [(None, None)]*(2*n-1)

    for i, x in enumerate(A):
        T[n-1+i] = (x, i)

    for i in reversed(xrange(n-1)):
        T[i] = max(T[i*2+1], T[i*2+2])

    return T

def print_tree(T):
    print 'digraph {'
    for i, x in enumerate(T):
        print '    ' + str(i) + '[label="' + str(x) + '"]'
        if i*2+2 < len(T):
            print '    ' + str(i)+ '->'+ str(i*2+1)
            print '    ' + str(i)+ '->'+ str(i*2+2)

    print '}'

def find_generic(T, i, fallback, check, first, second):
    j = len(T)/2+i
    original = T[j]
    j = (j-1)/2

    #go up in the tree searching for a value that satisfies check
    while j > 0 and not check(T[second(j)], original):
        j = (j-1)/2

    #go down in the tree searching for the left/rightmost node that satisfies check
    while j*2+1<len(T):
        if check(T[first(j)], original):
            j = first(j)
        elif check(T[second(j)], original):
            j = second(j)
        else:
            return fallback

    return j-len(T)/2


def find_left(T, i, fallback):
    return find_generic(T, i, fallback, 
        lambda a, b: a[0]>b[0] and a[1]<b[1],  #value greater, index before
        lambda j: j*2+2,                       #rightmost first
        lambda j: j*2+1                        #leftmost second
    ) 


def find_right(T, i, fallback):
    return find_generic(T, i, fallback,
        lambda a, b: a[0]>=b[0] and a[1]>b[1], #value greater or equal, index after
        lambda j: j*2+1,                       #leftmost first
        lambda j: j*2+2                        #rightmost second
    )       

def optimized_version(A):
    T = make_tree(A)

    answer = defaultdict(lambda: 0)
    for i, x in enumerate(A):
        left = find_left(T, i, -1)
        right = find_right(T, i, len(A))
        answer[x] += (i-left) * (right-i)

    return dict(answer)

def naive_version(A):
    answer = defaultdict(lambda: 0)
    for i, x in enumerate(A):
        left = next((j for j in range(i-1, -1, -1) if A[j]>A[i]), -1)
        right = next((j for j in range(i+1, len(A)) if A[j]>=A[i]), len(A))
        answer[x] += (i-left) * (right-i)
    return dict(answer)


A = [random.choice(xrange(32)) for i in xrange(8)]    
MA1 = naive_version(A)
MA2 = optimized_version(A)

sys.stderr.write('Array:     ' + str(A) + '
')
sys.stderr.write('Naive:     ' + str(MA1) + '
')
sys.stderr.write('Optimized: ' + str(MA2)  + '
')
sys.stderr.write('OK:        ' + str(MA1==MA2)  + '
')
#print_tree(make_tree(A))

与恶龙缠斗过久,自身亦成为恶龙;凝视深渊过久,深渊将回以凝视…
Welcome to WuJiGu Developer Q&A Community for programmer and developer-Open, Learning and Share
...