How to make this custom worksheet initialization faster?

Summary

This question is a follow-up to: How to implement column self-naming from its index?

After testing the code from the answers to the question linked above, I ran into a serious performance issue.

Performance issue

The performance issue occurs when a Sheet is initialized, that is, when I initialize the Sheet's cells.

    ''' <summary>
    ''' Initialize an instance of the Company.Project.Sheet class.
    ''' </summary>
    ''' <param name="nativeSheet">The native worksheet from which to initialize.</param>
    Friend Sub New(ByVal nativeSheet As Microsoft.Office.Interop.Excel.Worksheet)
        _nativeSheet = nativeSheet
        Dim cells As IDictionary(Of String, ICell) = New Dictionary(Of String, ICell)()

        ' These iterations hurt the performance of the API...
        For rowIndex As Integer = 1 To _nativeSheet.Rows.Count Step 1
            For colIndex As Integer = 1 To _nativeSheet.Columns.Count Step 1
                Dim c As ICell = New Cell(_nativeSheet.Cells(rowIndex, colIndex))
                cells.Add(c.Name, c)
            Next
        Next

        _cellules = New ReadOnlyDictionary(Of String, ICell)(cells)
    End Sub

ReadOnlyDictionary(Of TKey, TValue): a custom read-only dictionary that simply wraps an IDictionary(Of TKey, TValue) to prevent modifications.

Discussion

I'm working this way because, in the underlying spreadsheet, every cell of a worksheet exists from the moment the worksheet is initialized until it is disposed or finalized. I wish to initialize the cells of a Sheet the same way. I also wish to keep the performance boost of addressing cells by index rather than by name ("A1"), while still letting the API user refer to a cell by its name. That is what the dictionary is for: when I refer to cell "A1", I look up this key in my dictionary and address the cell (1, 1) accordingly.
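The name-to-index mapping described above can also be computed rather than stored. The following is a minimal sketch, assuming A1-style names; the module `CellNaming` and the helpers `ColumnName`, `CellName`, and `ParseCellName` are hypothetical names introduced for illustration and are not part of the original API.

```vbnet
Imports System

Module CellNaming
    ' Converts a 1-based column index to its letter name (1 -> "A", 27 -> "AA").
    Function ColumnName(ByVal columnIndex As Integer) As String
        If columnIndex < 1 Then Throw New ArgumentOutOfRangeException("columnIndex")
        Dim name As String = String.Empty
        Dim n As Integer = columnIndex
        While n > 0
            ' Bijective base 26: shift by one so that 1..26 map to A..Z.
            name = ChrW(AscW("A"c) + (n - 1) Mod 26) & name
            n = (n - 1) \ 26
        End While
        Return name
    End Function

    ' Builds the A1-style name of the cell at the given 1-based row and column.
    Function CellName(ByVal row As Integer, ByVal column As Integer) As String
        Return ColumnName(column) & row.ToString()
    End Function

    ' Parses an A1-style name back into 1-based row and column indexes.
    Sub ParseCellName(ByVal name As String, ByRef row As Integer, ByRef column As Integer)
        Dim i As Integer = 0
        column = 0
        While i < name.Length
            Dim ch As Char = Char.ToUpperInvariant(name(i))
            If ch < "A"c OrElse ch > "Z"c Then Exit While
            column = column * 26 + (AscW(ch) - AscW("A"c) + 1)
            i += 1
        End While
        ' The name needs at least one letter followed by at least one digit.
        If i = 0 OrElse i = name.Length Then
            Throw New FormatException("Expected an A1-style name such as ""B12"".")
        End If
        row = Integer.Parse(name.Substring(i))
    End Sub
End Module
```

With helpers like these, `CellName(1, 1)` yields "A1" and `ParseCellName("A1", row, column)` gives back row 1 and column 1, so a name can be translated to indexes on demand without pre-populating a dictionary entry for every cell.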

As an aside, I know of an even faster way to read from a worksheet: the Worksheet.UsedRange property, which gives access to all of the used cells as a 2D matrix. If there were a similar way to get a set of cells from which I could initialize multiple instances of my Cell class, that would be great, and performant!

I also thought of initializing only, say, a 100 x 100 matrix of cells in memory and mapping those in my dictionary, since one rarely uses all of a sheet's cells. I am still thinking about what should happen when someone accesses a cell that is not yet initialized, let's say Cells(120, 120). Ideally, I think, the program would have to initialize all the cells between the largest initially initialized cell, Cell(100, 100), and Cell(120, 120). Am I clear enough here? Feel free to ask for clarification! =)
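One way to approach the on-demand idea is a cache that creates a cell only the first time it is requested. This is a sketch under assumptions, not the approach the post settles on: `LazyCellCache` and its `factory` delegate are hypothetical, and it creates just the requested cell instead of filling the whole gap between Cell(100, 100) and Cell(120, 120).

```vbnet
Imports System
Imports System.Collections.Generic

' Hypothetical helper: creates a cell on first access and caches it by name.
Friend Class LazyCellCache(Of TCell)
    ' Cell names are treated as case-insensitive ("a1" and "A1" are the same key).
    Private ReadOnly _cells As New Dictionary(Of String, TCell)(StringComparer.OrdinalIgnoreCase)
    Private ReadOnly _factory As Func(Of String, TCell)

    ' factory builds the cell for a given name; it runs at most once per name.
    Friend Sub New(ByVal factory As Func(Of String, TCell))
        If factory Is Nothing Then Throw New ArgumentNullException("factory")
        _factory = factory
    End Sub

    ' Returns the cached cell for the name, creating it the first time it is requested.
    Default Friend ReadOnly Property Item(ByVal name As String) As TCell
        Get
            Dim cell As TCell = Nothing
            If Not _cells.TryGetValue(name, cell) Then
                cell = _factory(name)
                _cells.Add(name, cell)
            End If
            Return cell
        End Get
    End Property

    ' Number of cells created so far.
    Friend ReadOnly Property Count() As Integer
        Get
            Return _cells.Count
        End Get
    End Property
End Class
```

In the Sheet constructor, the nested loops could then be replaced by something like `New LazyCellCache(Of ICell)(Function(name) New Cell(_nativeSheet.Range(name)))`, assuming every requested name is a valid A1 reference. Construction becomes cheap because no interop call happens until a cell is actually used.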

Another option could be to store only the cells' names in the dictionary, keeping their row and column indexes in memory, without initializing a Cell instance with its nativeCell, say a Range. Here's the code of my Cell class to illustrate what I mean.

    ''' <summary>
    ''' Represents a cell in a worksheet.
    ''' </summary>
    Friend Class Cell
        Implements ICell

        Private _nativeCell As Microsoft.Office.Interop.Excel.Range
        Private _name As String

        ''' <summary>
        ''' Initializes a new instance of the Company.Project.Cell class.
        ''' </summary>
        ''' <param name="nativeCell">The Microsoft.Office.Interop.Excel.Range to wrap.</param>
        Friend Sub New(ByVal nativeCell As Microsoft.Office.Interop.Excel.Range)
            _nativeCell = nativeCell
        End Sub

        Public ReadOnly Property NativeCell() As Microsoft.Office.Interop.Excel.Range Implements ICell.NativeCell
            Get
                Return _nativeCell
            End Get
        End Property

        Public ReadOnly Property Column() As Integer Implements ICell.Column
            Get
                Return _nativeCell.Column
            End Get
        End Property

        Public ReadOnly Property Row() As Integer Implements ICell.Row
            Get
                Return _nativeCell.Row
            End Get
        End Property

        Public ReadOnly Property Name() As String Implements ICell.Name
            Get
                ' GetColumnName() comes from the linked question and is not shown here.
                If (String.IsNullOrEmpty(_name) OrElse _name.Trim().Length = 0) Then _
                    _name = GetColumnName()

                Return _name
            End Get
        End Property

        Public Property Value() As Object Implements ICell.Value
            Get
                Return _nativeCell.Value2
            End Get
            Set(ByVal value As Object)
                _nativeCell.Value2 = value
            End Set
        End Property

        Public ReadOnly Property FormattedValue() As String Implements ICell.FormattedValue
            Get
                Return _nativeCell.Text
            End Get
        End Property

        Public ReadOnly Property NumericValue() As Double? Implements ICell.NumericValue
            Get
                ' Relies on the implicit Object-to-Double? conversion (Option Strict Off).
                Return Value
            End Get
        End Property
    End Class

Questions

What are my other options?

Are there any other ways to walk through the cells?

Is there a way to make the current approach viable performance-wise?

For your information, my tests time out because of this issue: they never finish within an acceptable time and seem to take centuries...

Any thoughts are welcome! I'm open to other solutions or approaches that will help me achieve this objective while addressing this performance issue.

Thanks to you all! =)

EDIT #1

Thanks to Maxim Gueivandov, whose solution solves the issue I raised in this question.

That said, another problem arose from this solution: a System.OutOfMemoryException. It will be addressed in another question.

My sincerest thanks to Maxim Gueivandov.

Solution

You could try to get all cells in the used range in one hop, which avoids calling Cells(rowIndex, colIndex) on every iteration (I guess that Cells hides an interop call, which may have a performance impact).

    Dim usedRange As Range = nativeSheet.UsedRange
    Dim cells(,) As Object = DirectCast(usedRange.get_Value( _
        XlRangeValueDataType.xlRangeValueDefault), Object(,))
    ' [... do your row/col iterations ...]
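The row/column iteration left open above might look like the following sketch. It works on a plain `Object(,)` so it does not depend on Excel being available; the module `UsedRangeSketch`, the function `MapValuesByPosition`, and the R1C1-style keys are assumptions made for illustration. The comments at the top show how the array could be obtained, using `Value2` (the same property the Cell class reads) instead of `get_Value`.

```vbnet
Imports System
Imports System.Collections.Generic

Module UsedRangeSketch
    ' Obtaining the block (assumed usage, requires the Excel interop assembly):
    '   Dim usedRange As Excel.Range = nativeSheet.UsedRange
    '   Dim values As Object(,) = TryCast(usedRange.Value2, Object(,))
    '   ' values is Nothing when the range is a single cell (Value2 is then a scalar).
    '   Dim map = MapValuesByPosition(values, usedRange.Row, usedRange.Column)

    ' Walks a 2D block of values and keys each one by its sheet position ("R3C2").
    ' firstRow and firstColumn are the sheet coordinates of the block's top-left cell.
    Function MapValuesByPosition(ByVal values As Object(,), _
                                 ByVal firstRow As Integer, _
                                 ByVal firstColumn As Integer) As Dictionary(Of String, Object)
        If values Is Nothing Then Throw New ArgumentNullException("values")
        Dim result As New Dictionary(Of String, Object)()

        ' Arrays returned by Excel are 1-based, so read the bounds instead of assuming 0.
        Dim rowBase As Integer = values.GetLowerBound(0)
        Dim colBase As Integer = values.GetLowerBound(1)

        For r As Integer = rowBase To values.GetUpperBound(0)
            For c As Integer = colBase To values.GetUpperBound(1)
                Dim sheetRow As Integer = firstRow + (r - rowBase)
                Dim sheetColumn As Integer = firstColumn + (c - colBase)
                result.Add(String.Format("R{0}C{1}", sheetRow, sheetColumn), values(r, c))
            Next
        Next
        Return result
    End Function
End Module
```

The loops here only touch an in-memory array, so the single `Value2` read is the only interop call involved. Note that this yields values rather than Range objects, so a Cell built from it could not wrap a native Range without a further lookup.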

You'll find some performance tips on which I based these assumptions in the following article: C# Excel Interop Use. Most notably, check the benchmark part:

=== Excel interop benchmark in C# ===

Cells[]: 30.0 seconds

get_Range(), Cells[]: 15.0 seconds

UsedRange, get_Value(): 1.5 seconds [fastest]
