Quickly creating a sparse arrayAddition of sparse array objectsManipulating sparse array elementsDevising a...

Can you tell from a blurry photo if focus was too close or too far?

Why zero tolerance on nudity in space?

Quickly creating a sparse array

Do authors have to be politically correct in article-writing?

Words and Words with "ver-" Prefix

How much mayhem could I cause as a sentient fish?

Can a Spectator be a bodyguard? So, can its treasure/item to guard be a person/person's item?

Numbers with a minus sign in a matrix not aligned with the numbers wihtout minus sign

Move fast ...... Or you will lose

Why did Democrats in the Senate oppose the Born-Alive Abortion Survivors Protection Act (2019 S.130)?

Alien invasion to probe us, why?

Dilemma of explaining to interviewer that he is the reason for declining second interview

Why exactly do action photographers need high fps burst cameras?

How can I get my players to come to the game session after agreeing to a date?

Difference between i++ and (i)++ in C

Why publish a research paper when a blog post or a lecture slide can have more citation count than a journal paper?

How to use Mathematica to do a complex integrate with poles in real axis?

A Missing Symbol for This Logo

Why was Lupin comfortable with saying Voldemort's name?

Why is Agricola named as such?

How to make ice magic work from a scientific point of view?

Consequences of lack of rigour

How to deal with an incendiary email that was recalled

Create a Price Tag Icon with Rounded Corners



Quickly creating a sparse array


Addition of sparse array objectsManipulating sparse array elementsDevising a sparse array ruleParallelizing sparse array constructionsparse array with listNon-numeric elements in banded sparse arrayAdding two SparseArrays produces zeros in the reported “NonzeroValues”SparseArray: accesing nonzero row and column entriesParallelize the construction of sparse matricesHow to construct a time-dependent matrix quickly?













4












$begingroup$


The motivation for my question is that I'm creating very large, very sparse symmetric matrices and finding the largest eigenvalue. (Think upwards of $10^5times 10^5$ size matrices, with at most $10^2$ nonzero entries on each row. However, to give a workable example, below I'll use $10^4times 10^4$ size matrices, with $30$ nonzero entries.) The entries are mostly independent of one another, although there is some time savings in working with one row at a time. (In the example below, I model this type of behavior by having the constant $c$ in the computation depend only on the row index. In the actual work I do, the dependence isn't quite that simple.) By symmetry, we may focus on the part of the matrix above the main diagonal. (For simplicity, I'm assuming the main diagonal is zero in the example below.)



So, to give a toy example consider the following code:



M = Table[0.0, {i, 1, 10000}, {j, 1, 10000}];
Do[
c = RandomInteger[{1, 10}];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
M[[PossibleColumns[[j]], i]] = c/RandomInteger[{1, 10}];
M[[i, PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
],
{i, 1, 9999}
];
Eigenvalues[M, 1];


The creation of the large array takes a couple seconds, and seems to fill up about 1 GB of RAM. (For the larger matrices I want to build, I quickly run out of RAM.) It takes about 3 seconds to update the nonzero entries in the matrix $M$. Finally, the eigenvalue function takes about 3 minutes.



If I replace the top line with the new line



M = SparseArray[{}, {10000, 10000}, 0.0];


and rerun the entire code, then there are no RAM problems (even for much, much larger matrices). The eigenvalue step is almost instantaneous. However, the updating takes over an hour! I was very surprised that the updating takes so long, so I made the following intermediate version, which rather than updating each entry of $M$ independently, it updates each row of $M$ once.



M = SparseArray[{}, {10000, 10000}, 0.0];
Do[
c = RandomInteger[{1, 10}];
jRow = SparseArray[{}, {10000}, 0.0];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
jRow[[PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
];
M[[i]] = jRow,
{i, 1, 9999}
];
M = M + Transpose[M];
Eigenvalues[M, 1];


This has the RAM savings, and the eigenvalue savings, and finishes the updating of $M$ in a little over 4.5 minutes.



My question is this: Is there a way to keep (most of) the RAM savings of using a sparse array, along with the time saving in the eigenvalue computation, but also reduce the time used in updating $M$ down to the timing in the first example (which only used 3 seconds)?










share|improve this question









$endgroup$








  • 2




    $begingroup$
    If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
    $endgroup$
    – MarcoB
    1 hour ago










  • $begingroup$
    @MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
    $endgroup$
    – Pace Nielsen
    1 hour ago












  • $begingroup$
    Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
    $endgroup$
    – MarcoB
    1 hour ago








  • 2




    $begingroup$
    Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
    $endgroup$
    – Szabolcs
    58 mins ago
















4












$begingroup$


The motivation for my question is that I'm creating very large, very sparse symmetric matrices and finding the largest eigenvalue. (Think upwards of $10^5times 10^5$ size matrices, with at most $10^2$ nonzero entries on each row. However, to give a workable example, below I'll use $10^4times 10^4$ size matrices, with $30$ nonzero entries.) The entries are mostly independent of one another, although there is some time savings in working with one row at a time. (In the example below, I model this type of behavior by having the constant $c$ in the computation depend only on the row index. In the actual work I do, the dependence isn't quite that simple.) By symmetry, we may focus on the part of the matrix above the main diagonal. (For simplicity, I'm assuming the main diagonal is zero in the example below.)



So, to give a toy example consider the following code:



M = Table[0.0, {i, 1, 10000}, {j, 1, 10000}];
Do[
c = RandomInteger[{1, 10}];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
M[[PossibleColumns[[j]], i]] = c/RandomInteger[{1, 10}];
M[[i, PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
],
{i, 1, 9999}
];
Eigenvalues[M, 1];


The creation of the large array takes a couple seconds, and seems to fill up about 1 GB of RAM. (For the larger matrices I want to build, I quickly run out of RAM.) It takes about 3 seconds to update the nonzero entries in the matrix $M$. Finally, the eigenvalue function takes about 3 minutes.



If I replace the top line with the new line



M = SparseArray[{}, {10000, 10000}, 0.0];


and rerun the entire code, then there are no RAM problems (even for much, much larger matrices). The eigenvalue step is almost instantaneous. However, the updating takes over an hour! I was very surprised that the updating takes so long, so I made the following intermediate version, which rather than updating each entry of $M$ independently, it updates each row of $M$ once.



M = SparseArray[{}, {10000, 10000}, 0.0];
Do[
c = RandomInteger[{1, 10}];
jRow = SparseArray[{}, {10000}, 0.0];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
jRow[[PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
];
M[[i]] = jRow,
{i, 1, 9999}
];
M = M + Transpose[M];
Eigenvalues[M, 1];


This has the RAM savings, and the eigenvalue savings, and finishes the updating of $M$ in a little over 4.5 minutes.



My question is this: Is there a way to keep (most of) the RAM savings of using a sparse array, along with the time saving in the eigenvalue computation, but also reduce the time used in updating $M$ down to the timing in the first example (which only used 3 seconds)?










share|improve this question









$endgroup$








  • 2




    $begingroup$
    If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
    $endgroup$
    – MarcoB
    1 hour ago










  • $begingroup$
    @MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
    $endgroup$
    – Pace Nielsen
    1 hour ago












  • $begingroup$
    Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
    $endgroup$
    – MarcoB
    1 hour ago








  • 2




    $begingroup$
    Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
    $endgroup$
    – Szabolcs
    58 mins ago














4












4








4


1



$begingroup$


The motivation for my question is that I'm creating very large, very sparse symmetric matrices and finding the largest eigenvalue. (Think upwards of $10^5times 10^5$ size matrices, with at most $10^2$ nonzero entries on each row. However, to give a workable example, below I'll use $10^4times 10^4$ size matrices, with $30$ nonzero entries.) The entries are mostly independent of one another, although there is some time savings in working with one row at a time. (In the example below, I model this type of behavior by having the constant $c$ in the computation depend only on the row index. In the actual work I do, the dependence isn't quite that simple.) By symmetry, we may focus on the part of the matrix above the main diagonal. (For simplicity, I'm assuming the main diagonal is zero in the example below.)



So, to give a toy example consider the following code:



M = Table[0.0, {i, 1, 10000}, {j, 1, 10000}];
Do[
c = RandomInteger[{1, 10}];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
M[[PossibleColumns[[j]], i]] = c/RandomInteger[{1, 10}];
M[[i, PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
],
{i, 1, 9999}
];
Eigenvalues[M, 1];


The creation of the large array takes a couple seconds, and seems to fill up about 1 GB of RAM. (For the larger matrices I want to build, I quickly run out of RAM.) It takes about 3 seconds to update the nonzero entries in the matrix $M$. Finally, the eigenvalue function takes about 3 minutes.



If I replace the top line with the new line



M = SparseArray[{}, {10000, 10000}, 0.0];


and rerun the entire code, then there are no RAM problems (even for much, much larger matrices). The eigenvalue step is almost instantaneous. However, the updating takes over an hour! I was very surprised that the updating takes so long, so I made the following intermediate version, which rather than updating each entry of $M$ independently, it updates each row of $M$ once.



M = SparseArray[{}, {10000, 10000}, 0.0];
Do[
c = RandomInteger[{1, 10}];
jRow = SparseArray[{}, {10000}, 0.0];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
jRow[[PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
];
M[[i]] = jRow,
{i, 1, 9999}
];
M = M + Transpose[M];
Eigenvalues[M, 1];


This has the RAM savings, and the eigenvalue savings, and finishes the updating of $M$ in a little over 4.5 minutes.



My question is this: Is there a way to keep (most of) the RAM savings of using a sparse array, along with the time saving in the eigenvalue computation, but also reduce the time used in updating $M$ down to the timing in the first example (which only used 3 seconds)?










share|improve this question









$endgroup$




The motivation for my question is that I'm creating very large, very sparse symmetric matrices and finding the largest eigenvalue. (Think upwards of $10^5times 10^5$ size matrices, with at most $10^2$ nonzero entries on each row. However, to give a workable example, below I'll use $10^4times 10^4$ size matrices, with $30$ nonzero entries.) The entries are mostly independent of one another, although there is some time savings in working with one row at a time. (In the example below, I model this type of behavior by having the constant $c$ in the computation depend only on the row index. In the actual work I do, the dependence isn't quite that simple.) By symmetry, we may focus on the part of the matrix above the main diagonal. (For simplicity, I'm assuming the main diagonal is zero in the example below.)



So, to give a toy example consider the following code:



M = Table[0.0, {i, 1, 10000}, {j, 1, 10000}];
Do[
c = RandomInteger[{1, 10}];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
M[[PossibleColumns[[j]], i]] = c/RandomInteger[{1, 10}];
M[[i, PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
],
{i, 1, 9999}
];
Eigenvalues[M, 1];


The creation of the large array takes a couple seconds, and seems to fill up about 1 GB of RAM. (For the larger matrices I want to build, I quickly run out of RAM.) It takes about 3 seconds to update the nonzero entries in the matrix $M$. Finally, the eigenvalue function takes about 3 minutes.



If I replace the top line with the new line



M = SparseArray[{}, {10000, 10000}, 0.0];


and rerun the entire code, then there are no RAM problems (even for much, much larger matrices). The eigenvalue step is almost instantaneous. However, the updating takes over an hour! I was very surprised that the updating takes so long, so I made the following intermediate version, which rather than updating each entry of $M$ independently, it updates each row of $M$ once.



M = SparseArray[{}, {10000, 10000}, 0.0];
Do[
c = RandomInteger[{1, 10}];
jRow = SparseArray[{}, {10000}, 0.0];
PossibleColumns = RandomSample[Range[i + 1, 10000], Min[10000 - i - 1, 30]];
Do[
jRow[[PossibleColumns[[j]]]] = c/RandomInteger[{1, 10}],
{j, 1, Length[PossibleColumns]}
];
M[[i]] = jRow,
{i, 1, 9999}
];
M = M + Transpose[M];
Eigenvalues[M, 1];


This has the RAM savings, and the eigenvalue savings, and finishes the updating of $M$ in a little over 4.5 minutes.



My question is this: Is there a way to keep (most of) the RAM savings of using a sparse array, along with the time saving in the eigenvalue computation, but also reduce the time used in updating $M$ down to the timing in the first example (which only used 3 seconds)?







sparse-arrays






share|improve this question













share|improve this question











share|improve this question




share|improve this question










asked 2 hours ago









Pace NielsenPace Nielsen

1805




1805








  • 2




    $begingroup$
    If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
    $endgroup$
    – MarcoB
    1 hour ago










  • $begingroup$
    @MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
    $endgroup$
    – Pace Nielsen
    1 hour ago












  • $begingroup$
    Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
    $endgroup$
    – MarcoB
    1 hour ago








  • 2




    $begingroup$
    Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
    $endgroup$
    – Szabolcs
    58 mins ago














  • 2




    $begingroup$
    If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
    $endgroup$
    – MarcoB
    1 hour ago










  • $begingroup$
    @MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
    $endgroup$
    – Pace Nielsen
    1 hour ago












  • $begingroup$
    Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
    $endgroup$
    – MarcoB
    1 hour ago








  • 2




    $begingroup$
    Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
    $endgroup$
    – Szabolcs
    58 mins ago








2




2




$begingroup$
If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
$endgroup$
– MarcoB
1 hour ago




$begingroup$
If you can express the values of the non-zero terms as a function of the position indices, you should be able to construct the SparseArray from the ground up, by giving a list of rules, or by using patterns and conditions. That should be MUCH better than modifying them element-wise.
$endgroup$
– MarcoB
1 hour ago












$begingroup$
@MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
$endgroup$
– Pace Nielsen
1 hour ago






$begingroup$
@MarcoB Yes, I have a function $f(i,j)$ that gives the value of the matrix in the $(i,j)$ coordinate. Moreover, I have a function $g(i)$ that gives me a list of those $j$ such that the $(i,j)$ entry is nonzero. (In the toy example above, you would replace "PossibleColumns" by $g(i)$, and you would replace "c/RandomInteger[{1,10}]" with $f(i,j)$.) I don't know how to turn that into a faster population of the sparse array though; any help with the syntax would be appreciated!
$endgroup$
– Pace Nielsen
1 hour ago














$begingroup$
Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
$endgroup$
– MarcoB
1 hour ago






$begingroup$
Can you share those functions, perhaps expressed both in plain language, and in Mathematica code? Just in case one of the two is more readable than the other.
$endgroup$
– MarcoB
1 hour ago






2




2




$begingroup$
Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
$endgroup$
– Szabolcs
58 mins ago




$begingroup$
Changing elements of sparse arrays is slow because the whole array must be re-built after each change. Do not assign elements in a loop!! Instead, generate a list like {i,j} -> value first, then built the sparse array in a single step.
$endgroup$
– Szabolcs
58 mins ago










1 Answer
1






active

oldest

votes


















5












$begingroup$

Let's assume we are given the following data:



n = 100000;
k = 100;

g[i_] := RandomSample[i + 1 ;; n, Min[n - i - 1, k]];
f[i_, jlist_] := RandomReal[{1, 10}, Length[jlist]];

rowcolumnindices = g /@ Range[1, n - 1];
rowlengths = Length /@ rowcolumnindices;
rowvalues = MapIndexed[
{jlist, i} [Function] f[i[[1]], jlist],
rowcolumnindices
];


Probably the most efficient way to populate the sparse array with this data is by preparing the CRS data by hand and to use the following undocumented constructor:



First @ AbsoluteTiming[
orderings = Ordering /@ rowcolumnindices;
columnindices = Partition[Join @@ MapThread[Part, {rowcolumnindices, orderings}], 1];
rowpointers = Accumulate[Join[{0}, Length /@ rowcolumnindices, {0}]];
values = Join @@ MapThread[Part, {rowvalues, orderings}];
M = # + Transpose[#] &[
SparseArray @@ {Automatic, {n, n}, 0., {1, {rowpointers, columnindices}, values}}
];
]



1.53083




I used machine real values here (not rational numbers) because they will be processed much faster, in particular in Eigenvalues.



The CRS format assumes that the column indices for each row are in ascending order. This is why we have to sort the column indices and the corresponding values first (this explains the extra fuzz around orderings).






share|improve this answer











$endgroup$













  • $begingroup$
    In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
    $endgroup$
    – Pace Nielsen
    30 mins ago






  • 1




    $begingroup$
    Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
    $endgroup$
    – Henrik Schumacher
    19 mins ago






  • 1




    $begingroup$
    This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
    $endgroup$
    – Pace Nielsen
    15 mins ago











Your Answer





StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["$", "$"], ["\\(","\\)"]]);
});
});
}, "mathjax-editing");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "387"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
autoActivateHeartbeat: false,
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});


}
});














draft saved

draft discarded


















StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f192328%2fquickly-creating-a-sparse-array%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown

























1 Answer
1






active

oldest

votes








1 Answer
1






active

oldest

votes









active

oldest

votes






active

oldest

votes









5












$begingroup$

Let's assume we are given the following data:



n = 100000;
k = 100;

g[i_] := RandomSample[i + 1 ;; n, Min[n - i - 1, k]];
f[i_, jlist_] := RandomReal[{1, 10}, Length[jlist]];

rowcolumnindices = g /@ Range[1, n - 1];
rowlengths = Length /@ rowcolumnindices;
rowvalues = MapIndexed[
{jlist, i} [Function] f[i[[1]], jlist],
rowcolumnindices
];


Probably the most efficient way to populate the sparse array with this data is by preparing the CRS data by hand and to use the following undocumented constructor:



First @ AbsoluteTiming[
orderings = Ordering /@ rowcolumnindices;
columnindices = Partition[Join @@ MapThread[Part, {rowcolumnindices, orderings}], 1];
rowpointers = Accumulate[Join[{0}, Length /@ rowcolumnindices, {0}]];
values = Join @@ MapThread[Part, {rowvalues, orderings}];
M = # + Transpose[#] &[
SparseArray @@ {Automatic, {n, n}, 0., {1, {rowpointers, columnindices}, values}}
];
]



1.53083




I used machine real values here (not rational numbers) because they will be processed much faster, in particular in Eigenvalues.



The CRS format assumes that the column indices for each row are in ascending order. This is why we have to sort the column indices and the corresponding values first (this explains the extra fuzz around orderings).






share|improve this answer











$endgroup$













  • $begingroup$
    In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
    $endgroup$
    – Pace Nielsen
    30 mins ago






  • 1




    $begingroup$
    Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
    $endgroup$
    – Henrik Schumacher
    19 mins ago






  • 1




    $begingroup$
    This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
    $endgroup$
    – Pace Nielsen
    15 mins ago
















5












$begingroup$

Let's assume we are given the following data:



n = 100000;
k = 100;

g[i_] := RandomSample[i + 1 ;; n, Min[n - i - 1, k]];
f[i_, jlist_] := RandomReal[{1, 10}, Length[jlist]];

rowcolumnindices = g /@ Range[1, n - 1];
rowlengths = Length /@ rowcolumnindices;
rowvalues = MapIndexed[
{jlist, i} [Function] f[i[[1]], jlist],
rowcolumnindices
];


Probably the most efficient way to populate the sparse array with this data is by preparing the CRS data by hand and to use the following undocumented constructor:



First @ AbsoluteTiming[
orderings = Ordering /@ rowcolumnindices;
columnindices = Partition[Join @@ MapThread[Part, {rowcolumnindices, orderings}], 1];
rowpointers = Accumulate[Join[{0}, Length /@ rowcolumnindices, {0}]];
values = Join @@ MapThread[Part, {rowvalues, orderings}];
M = # + Transpose[#] &[
SparseArray @@ {Automatic, {n, n}, 0., {1, {rowpointers, columnindices}, values}}
];
]



1.53083




I used machine real values here (not rational numbers) because they will be processed much faster, in particular in Eigenvalues.



The CRS format assumes that the column indices for each row are in ascending order. This is why we have to sort the column indices and the corresponding values first (this explains the extra fuzz around orderings).






share|improve this answer











$endgroup$













  • $begingroup$
    In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
    $endgroup$
    – Pace Nielsen
    30 mins ago






  • 1




    $begingroup$
    Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
    $endgroup$
    – Henrik Schumacher
    19 mins ago






  • 1




    $begingroup$
    This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
    $endgroup$
    – Pace Nielsen
    15 mins ago














5












5








5





$begingroup$

Let's assume we are given the following data:



n = 100000;
k = 100;

g[i_] := RandomSample[i + 1 ;; n, Min[n - i - 1, k]];
f[i_, jlist_] := RandomReal[{1, 10}, Length[jlist]];

rowcolumnindices = g /@ Range[1, n - 1];
rowlengths = Length /@ rowcolumnindices;
rowvalues = MapIndexed[
{jlist, i} [Function] f[i[[1]], jlist],
rowcolumnindices
];


Probably the most efficient way to populate the sparse array with this data is by preparing the CRS data by hand and to use the following undocumented constructor:



First @ AbsoluteTiming[
orderings = Ordering /@ rowcolumnindices;
columnindices = Partition[Join @@ MapThread[Part, {rowcolumnindices, orderings}], 1];
rowpointers = Accumulate[Join[{0}, Length /@ rowcolumnindices, {0}]];
values = Join @@ MapThread[Part, {rowvalues, orderings}];
M = # + Transpose[#] &[
SparseArray @@ {Automatic, {n, n}, 0., {1, {rowpointers, columnindices}, values}}
];
]



1.53083




I used machine real values here (not rational numbers) because they will be processed much faster, in particular in Eigenvalues.



The CRS format assumes that the column indices for each row are in ascending order. This is why we have to sort the column indices and the corresponding values first (this explains the extra fuzz around orderings).






share|improve this answer











$endgroup$



Let's assume we are given the following data:



n = 100000;
k = 100;

g[i_] := RandomSample[i + 1 ;; n, Min[n - i - 1, k]];
f[i_, jlist_] := RandomReal[{1, 10}, Length[jlist]];

rowcolumnindices = g /@ Range[1, n - 1];
rowlengths = Length /@ rowcolumnindices;
rowvalues = MapIndexed[
{jlist, i} [Function] f[i[[1]], jlist],
rowcolumnindices
];


Probably the most efficient way to populate the sparse array with this data is by preparing the CRS data by hand and to use the following undocumented constructor:



First @ AbsoluteTiming[
orderings = Ordering /@ rowcolumnindices;
columnindices = Partition[Join @@ MapThread[Part, {rowcolumnindices, orderings}], 1];
rowpointers = Accumulate[Join[{0}, Length /@ rowcolumnindices, {0}]];
values = Join @@ MapThread[Part, {rowvalues, orderings}];
M = # + Transpose[#] &[
SparseArray @@ {Automatic, {n, n}, 0., {1, {rowpointers, columnindices}, values}}
];
]



1.53083




I used machine real values here (not rational numbers) because they will be processed much faster, in particular in Eigenvalues.



The CRS format assumes that the column indices for each row are in ascending order. This is why we have to sort the column indices and the corresponding values first (this explains the extra fuzz around orderings).







share|improve this answer














share|improve this answer



share|improve this answer








edited 21 mins ago

























answered 1 hour ago









Henrik SchumacherHenrik Schumacher

55.2k575154




55.2k575154












  • $begingroup$
    In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
    $endgroup$
    – Pace Nielsen
    30 mins ago






  • 1




    $begingroup$
    Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
    $endgroup$
    – Henrik Schumacher
    19 mins ago






  • 1




    $begingroup$
    This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
    $endgroup$
    – Pace Nielsen
    15 mins ago


















  • $begingroup$
    In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
    $endgroup$
    – Pace Nielsen
    30 mins ago






  • 1




    $begingroup$
    Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
    $endgroup$
    – Henrik Schumacher
    19 mins ago






  • 1




    $begingroup$
    This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
    $endgroup$
    – Pace Nielsen
    15 mins ago
















$begingroup$
In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
$endgroup$
– Pace Nielsen
30 mins ago




$begingroup$
In the toy example I gave above, the entries are chosen randomly. In the actual computation I care about the entries are determined by the coordinates. I made the toy example different in this way for two reasons: (1) the deterministic function is very complicated, and (2) I wanted to easily create a large matrix to help illustrate the situation. So, I'm having trouble converting your beautiful ideas to my situation, because "rowvalues" is not so easily created. My problem really boils down to trying to put all that data in a list quickly, as you have done with rowvalues.
$endgroup$
– Pace Nielsen
30 mins ago




1




1




$begingroup$
Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
$endgroup$
– Henrik Schumacher
19 mins ago




$begingroup$
Yes, I assumed that you can compute the values of each row before the actual matrix is constructed. I tried to address the functions f and g in my recent edit. The only difference of my function f to yours is that it expects a whole list of js as a second argument so that it creates all values of a row at once. Does this help?
$endgroup$
– Henrik Schumacher
19 mins ago




1




1




$begingroup$
This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
$endgroup$
– Pace Nielsen
15 mins ago




$begingroup$
This looks great! I'm still trying to understand it all. I'll let you know soon if I have any further questions.
$endgroup$
– Pace Nielsen
15 mins ago


















draft saved

draft discarded




















































Thanks for contributing an answer to Mathematica Stack Exchange!


  • Please be sure to answer the question. Provide details and share your research!

But avoid



  • Asking for help, clarification, or responding to other answers.

  • Making statements based on opinion; back them up with references or personal experience.


Use MathJax to format equations. MathJax reference.


To learn more, see our tips on writing great answers.




draft saved


draft discarded














StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fmathematica.stackexchange.com%2fquestions%2f192328%2fquickly-creating-a-sparse-array%23new-answer', 'question_page');
}
);

Post as a guest















Required, but never shown





















































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown

































Required, but never shown














Required, but never shown












Required, but never shown







Required, but never shown







Popular posts from this blog

Benedict Cumberbatch Contingut Inicis Debut professional Premis Filmografia bàsica Premis i...

Monticle de plataforma Contingut Est de Nord Amèrica Interpretacions Altres cultures Vegeu...

Escacs Janus Enllaços externs Menú de navegacióEscacs JanusJanusschachBrainKing.comChessV